OpenCV for multi-target tracking
- Project framework
- Code block analysis
- Full code and resource download links
Project framework and modules used
The project framework is as follows:
├── mobilenet_ssd
│   ├── MobileNetSSD_deploy.caffemodel
│   └── MobileNetSSD_deploy.prototxt
├── multi_object_tracking_fast.py
├── race.mp4
├── race_output_slow.avi
└── race_output_fast.avi
Modules used
Python 3, OpenCV, dlib, python multiprocessing
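If these are not installed yet, they can usually be obtained with pip, e.g. pip install opencv-python dlib imutils numpy (dlib may need CMake and a C++ compiler to build; multiprocessing ships with Python).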
Code block analysis
Import required libraries
from imutils.video import FPS
import multiprocessing
import numpy as np
import argparse
import imutils
import dlib
import cv2
Spawn a new process with Python's Process class (each child process is independent of the parent process)
Process function definition
With Python multiprocessing, Python calls this function and spawns a new interpreter to execute the code inside it. As a result, each process that runs a tracker is independent of its parent process. To communicate with the Python driver script, we need to use pipes or queues; both are thread/process safe, implemented with locks and semaphores.
In essence, we are building a simple producer / consumer relationship:
- The parent process produces new frames and adds them to the input queue of a specific object tracker.
- The child process consumes those frames, applies object tracking, and returns the updated bounding box coordinates.
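To make the pattern concrete, here is a minimal, self-contained producer/consumer sketch (an illustration only, not part of the tracking script; the names consumer, iq, and oq are made up):

import multiprocessing

def consumer(inputQueue, outputQueue):
    # Child process: read items, process them, write results back
    while True:
        item = inputQueue.get()        # blocks until the parent sends work
        if item is None:               # sentinel value tells the child to stop
            break
        outputQueue.put(item * 2)      # stand-in for "update tracker, return box"

if __name__ == "__main__":
    iq, oq = multiprocessing.Queue(), multiprocessing.Queue()
    p = multiprocessing.Process(target=consumer, args=(iq, oq))
    p.daemon = True
    p.start()
    for i in range(3):
        iq.put(i)                      # parent produces work
        print(oq.get())                # and consumes the child's result
    iq.put(None)                       # ask the child to exit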
First, we try to get a new frame from the inputQueue (line 21 of the script).
If the frame is not empty, we update the object tracker with it and obtain the updated bounding box coordinates (lines 24-34).
Finally, we put the label and bounding box on the outputQueue so that the parent process can consume them in the script's main loop (line 38).
The full start_tracker function is shown below. After that, back in the parent process, we parse the command line arguments.
def start_tracker(box, label, rgb, inputQueue, outputQueue):
    '''
    brief : Construct a dlib rectangle object from the bounding box
            coordinates, then start the correlation tracker
    :param box: bounding box coordinates returned by the detector
    :param label: label of the detected object
    :param rgb: RGB-ordered image used to initialize the dlib object tracker
    :param inputQueue: input queue
    :param outputQueue: output queue
    :return: None
    '''
    # t is the tracker object
    t = dlib.correlation_tracker()
    rect = dlib.rectangle(box[0], box[1], box[2], box[3])
    t.start_track(rgb, rect)

    # This function runs in a daemon process, so we do not need to join it
    while True:
        # Try to get the next frame from the input queue
        rgb = inputQueue.get()

        # If there is an item in the queue, process it
        if rgb is not None:
            # Update the tracker and get the position of the tracked object
            t.update(rgb)
            pos = t.get_position()

            # Unpack the object position
            startX = int(pos.left())
            startY = int(pos.top())
            endX = int(pos.right())
            endY = int(pos.bottom())

            # Put the object label and bounding box (as a tuple) on the output queue
            outputQueue.put((label, (startX, startY, endX, endY)))
Parsing command line parameters
--prototxt: the path to the Caffe "deploy" prototxt file.
--model: the path to the pre-trained Caffe model that accompanies the prototxt.
--video: the path to the input video file in which we will perform multi-object tracking with dlib.
--output: an optional path to an output video file. If no path is specified, no video is written to disk. I recommend writing to an .avi or .mp4 file.
--confidence: an optional object-detection confidence threshold, default 0.2. This value is the minimum probability used to filter out weak detections from the object detector.
# Construct the command line argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
    help="path to input video file")
ap.add_argument("-o", "--output", type=str,
    help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
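With the project framework shown earlier, the script might be launched like this (the output filename here is only an example):

python multi_object_tracking_fast.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt --model mobilenet_ssd/MobileNetSSD_deploy.caffemodel --video race.mp4 --output race_output_fast.avi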
Initialize the I/O queues
These queues will hold the objects we are tracking. Each spawned process needs two queue objects:
- one to read input frames from
- one to write results to
# Initialize our lists of input and output queues, one pair for every object we track
inputQueues = []
outputQueues = []
Initialize other content
# Initialize the list of class labels the Caffe network was trained to detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
    "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
    "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
    "sofa", "train", "tvmonitor"]

# Load the serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# Initialize the video stream and output video writer
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
writer = None

# Start the FPS throughput counter
fps = FPS().start()
Start reading video stream
# Loop over every frame of the video stream
while True:
    # Grab the next frame
    (grabbed, frame) = vs.read()

    # Check whether we have reached the end of the video
    if frame is None:
        break

    # Resize the frame and convert it to RGB ordering (dlib expects RGB images)
    frame = imutils.resize(frame, width=600)
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # If we are supposed to write a video to disk, initialize the writer
    if args["output"] is not None and writer is None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(args["output"], fourcc, 30,
            (frame.shape[1], frame.shape[0]), True)
Handle the case where there are no input queues yet
If there are no input queues yet (line 101), we know we need to apply object detection before object tracking.
We apply object detection on lines 103-109 and loop over the results on line 112. We grab the confidence value and filter out weak detections on lines 115-119.
If the confidence meets the threshold set by the command line argument, the detection is kept, but we filter it further by class label. In this case, we only look for "person" objects (lines 122-127).
Assuming we found a "person", we create queues and spawn a tracking process:
We first compute the bounding box coordinates on lines 131-133.
From there we create two new queues, iq and oq (lines 137 and 138), and append them to inputQueues and outputQueues, respectively (lines 139 and 140).
We then spawn a new start_tracker process, passing it the bounding box, label, rgb image, and iq + oq (lines 143-147).
Finally, we draw the detected object's bounding rectangle and class label (lines 151-154). The full detection branch looks like this:
    # If the list of input queues is empty, we have not created our first object tracker yet
    if len(inputQueues) == 0:
        # Grab the frame dimensions and convert the frame to a blob
        # (the basic input unit of the Caffe model)
        (h, w) = frame.shape[:2]
        # Normalization: 0.007843 equals 1 / 127.5
        blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

        # Pass the blob through the network and obtain the detections,
        # returned as a 4-dimensional matrix:
        #   detections[0, 0, i, 1]   : class label index
        #   detections[0, 0, i, 2]   : confidence
        #   detections[0, 0, i, 3:7] : bounding box coordinates of the detected object
        net.setInput(blob)
        detections = net.forward()

        # Loop over the detections
        for i in np.arange(0, detections.shape[2]):
            # Extract the confidence, i.e. the probability of the detected object
            confidence = detections[0, 0, i, 2]

            # Filter out weak detections using the minimum confidence threshold
            if confidence > args["confidence"]:
                # Extract the class label index, which maps into CLASSES above
                idx = int(detections[0, 0, i, 1])
                label = CLASSES[idx]

                # Here we only track people; other class labels could be tracked as well
                if CLASSES[idx] != "person":
                    continue

                # Compute the (x, y) coordinates of the object's bounding box
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                (startX, startY, endX, endY) = box.astype("int")
                bb = (startX, startY, endX, endY)

                # Create two brand-new input and output queues
                iq = multiprocessing.Queue()
                oq = multiprocessing.Queue()
                inputQueues.append(iq)
                outputQueues.append(oq)

                # Spawn a daemon process for the new object tracker
                p = multiprocessing.Process(
                    target=start_tracker,
                    args=(bb, label, rgb, iq, oq))
                p.daemon = True
                p.start()

                # Draw the bounding box and class label for the detection (green, thickness 2)
                cv2.rectangle(frame, (startX, startY), (endX, endY),
                    (0, 255, 0), 2)
                cv2.putText(frame, label, (startX, startY - 15),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
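The code above only tracks the "person" class. If you also want to track other labels from the CLASSES list, one possible tweak (just a sketch; the TRACK_CLASSES name is not part of the original script) is to replace the single-class check with a membership test:

# Hypothetical variant: track several classes instead of only "person"
TRACK_CLASSES = {"person", "car", "dog"}      # any labels taken from CLASSES
if CLASSES[idx] not in TRACK_CLASSES:
    continue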
The input queues are not empty
Otherwise, object detection has already been performed, so we simply apply each existing dlib object tracker to the frame.
We loop over each input queue and add the rgb frame to it (lines 162 and 163).
Then we loop over each output queue (line 166) to obtain the updated bounding box coordinates from each individual object tracker (line 171). Finally, we draw the bounding box and class label on lines 175-178.
    # Otherwise detection has already been performed, so keep tracking the existing objects
    else:
        # Loop over each input queue and add the RGB frame, so that every
        # object tracker running in a separate process gets updated
        for iq in inputQueues:
            iq.put(rgb)

        # Loop over each output queue
        for oq in outputQueues:
            # Grab the updated bounding box.
            # The get method is a blocking call, so execution pauses here
            # until the corresponding process finishes its tracking update
            (label, (startX, startY, endX, endY)) = oq.get()

            # Draw the bounding box from the tracker
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 255, 0), 2)
            cv2.putText(frame, label, (startX, startY - 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
Finishing the loop
If necessary, we write the frame to the output video and display it on screen (lines 181-185).
If the "q" key is pressed, we quit and break out of the loop (lines 186-190).
If we keep processing frames, the FPS counter is updated on line 193 and we start again at the top of the while loop.
Otherwise, we are done processing frames, so we display the FPS throughput information, release the pointers, and close the windows.
    # Check whether the frame should be written to the output video
    if writer is not None:
        writer.write(frame)

    # Show the processed frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # Press 'q' to exit
    if key == ord("q"):
        break

    # Update the FPS counter
    fps.update()

# Stop the timer and display the FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# Release the video writer pointer
if writer is not None:
    writer.release()

# Clean up
cv2.destroyAllWindows()
vs.release()
Complete code, reference website, and resource download
Reference website: https://www.pyimagesearch.com/2018/10/29/multi-object-tracking-with-dlib/
Download link for the code, model, and video: https://download.csdn.net/download/qq903952032/12516940