SoFunction
Updated on 2024-10-29

Implementation of a center-of-mass tracking algorithm using OpenCV

The process of target tracking:

1. Accept an initial set of object detections

2. Create a unique ID for each initial detection

3. Track each object as it moves across frames in the video, maintaining the assignment of unique IDs

In this paper, we use OpenCV to implement center-of-mass tracking, an easy-to-understand but efficient tracking algorithm.

Steps of the center-of-mass tracking algorithm

Step 1: Accept the coordinates of the bounding box and calculate the center of mass

The center-of-mass tracking algorithm assumes that we pass a set of bounding box (x, y) coordinates for each detected object in each frame.

These bounding boxes can be generated by any type of object detector (color thresholding + contour extraction, Haar cascades, HOG + Linear SVM, SSD, Faster R-CNN, etc.), provided they are computed for every frame of the video.

The center of mass (the center (x, y) coordinates of the bounding box) can be calculated from the coordinates of the bounding box. The figure above demonstrates the calculation of the center of mass from a set of bounding box coordinates.

An ID is assigned to each initial bounding box.
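As a quick sketch (the helper name `bbox_centroid` is just for illustration; the full class below performs the same computation inline), the center of mass is simply the midpoint of the box's corner coordinates:

```python
# Compute the center of mass of a bounding box given as
# (startX, startY, endX, endY) -- the same format used later by update().
def bbox_centroid(startX, startY, endX, endY):
    cX = int((startX + endX) / 2.0)  # midpoint of the x extents
    cY = int((startY + endY) / 2.0)  # midpoint of the y extents
    return (cX, cY)

print(bbox_centroid(10, 20, 110, 220))  # -> (60, 120)
```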

Step 2: Calculate the Euclidean distance between the new bounding box and the existing objects

For each subsequent frame in the video stream, step 1 is performed first; we then need to determine whether we can associate the new object centers of mass (yellow) with the old object centers of mass (purple), rather than assigning a new unique ID to each detected object (which would defeat the purpose of object tracking). To do this, we compute the Euclidean distance between every pair of existing object centers of mass and input object centers of mass (highlighted with green arrows).

The above figure shows that we have detected three objects in the image this time. The two pairs close together are two existing objects.

We then calculate the Euclidean distance between each pair of old centers of mass (purple) and new centers of mass (yellow). But how do we use the Euclidean distances between these points to actually match and associate them?

The answer is in step 3.
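A small numeric sketch of the distance computation in step 2 (coordinates made up for illustration): pairwise Euclidean distances between existing object centers of mass (rows) and new input centers of mass (columns), computed here with NumPy broadcasting; `scipy.spatial.distance.cdist`, used by the tracker below, produces the same matrix.

```python
import numpy as np

old = np.array([[10, 10], [50, 50]])            # existing object centers of mass
new = np.array([[12, 11], [48, 52], [90, 90]])  # centers of mass in the new frame

# D[i, j] is the distance from existing object i to new detection j
D = np.linalg.norm(old[:, None, :] - new[None, :, :], axis=2)
print(D.shape)  # (2, 3): one row per existing object, one column per detection
```

The third column belongs to a detection far from both existing objects; as step 4 explains, such an unmatched detection will be registered as a new object.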

Step 3: Update the (x, y) coordinates of an existing object

The main assumption of the center-of-mass tracking algorithm is that a given object may move between subsequent frames, but the distance between the centers of mass of frames F_t and F_{t + 1} will be less than all other distances between objects.

Thus, if we choose to associate the center of mass with the minimum distance between subsequent frames, we can construct our object tracker.

In the figure above, one can see how the center-of-mass tracker algorithm selects associated centers of mass to minimize their respective Euclidean distances.

But what about the lonely spot in the lower left corner?

It's not associated with anything - what do we do with it?

Step 4: Registering a new object

If there are more input detections than existing objects being tracked, we need to register the new objects. "Registering" simply means adding the new object to our list of tracked objects by:

1. Assigning it a new object ID

2. Storing the center of mass of its bounding box coordinates

We can then return to step 2 and repeat the execution.

The figure above demonstrates the process of associating an existing object ID using the minimum Euclidean distance and then registering the new object.

Step 5: Deregistering old objects

When an old object moves out of view and cannot be matched for a number of consecutive frames, we deregister it.

Project structure

Center-of-mass tracking with OpenCV

Create a new centroidtracker.py and write the code:

# import the necessary packages
from scipy.spatial import distance as dist
from collections import OrderedDict
import numpy as np

class CentroidTracker():
    def __init__(self, maxDisappeared=50):
        # Counter used to assign a unique ID to each new object
        self.nextObjectID = 0
        # Maps object ID -> center-of-mass (x, y) coordinates
        self.objects = OrderedDict()
        # Maps object ID -> number of consecutive frames marked "disappeared"
        self.disappeared = OrderedDict()
        # Store the maximum number of consecutive frames a given object is
        # allowed to be marked as "disappeared" before we deregister it
        self.maxDisappeared = maxDisappeared

The required packages and modules are imported: SciPy's distance module, OrderedDict, and NumPy.

Define the class CentroidTracker. The constructor accepts a parameter maxDisappeared, the maximum number of consecutive frames a given object may be lost/disappeared for; if this value is exceeded, the object is removed.

Four class variables:

nextObjectID: A counter used to assign a unique ID to each object. If an object leaves the frame and does not return within maxDisappeared frames, it is deregistered, and a new object appearing later receives the next (new) object ID.

objects : Dictionary with object IDs as keys and center-of-mass (x, y) coordinates as values.

disappeared: Holds the number of consecutive frames (values) for which the specific object ID (key) has been marked as "lost".

maxDisappeared : The number of consecutive frames allowed to mark an object as "lost/disappeared" before we unregister it.

Let's define the register method that is responsible for adding new objects to our tracker:

    def register(self, centroid):
        # When registering an object, we use the next available object ID
        # to store the center of mass
        self.objects[self.nextObjectID] = centroid
        self.disappeared[self.nextObjectID] = 0
        self.nextObjectID += 1

    def deregister(self, objectID):
        # To deregister an object ID, we delete it from both dictionaries
        del self.objects[objectID]
        del self.disappeared[objectID]

The register method accepts a center of mass and adds it to the objects dictionary using the next available object ID.

The number of times an object has disappeared is initialized to 0 in the disappearance dictionary.

Finally, we increment nextObjectID so that if a new object comes into view, it will be associated with a unique ID.

Similar to our register method, we need a deregister method.

The deregister method removes the objectID from both the objects and disappeared dictionaries.

The core of the center-of-mass tracker implementation is the update method:

    def update(self, rects):
        # Check if the list of input bounding box rectangles is empty
        if len(rects) == 0:
            # Loop over any existing tracked objects and mark them as disappeared
            for objectID in list(self.disappeared.keys()):
                self.disappeared[objectID] += 1
                # Deregister if the maximum number of consecutive frames marked
                # as disappeared has been reached for this object
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)
            # Return early as there are no centroids or tracking info to update
            return self.objects

        # Initialize an array of input centroids for the current frame
        inputCentroids = np.zeros((len(rects), 2), dtype="int")
        # Loop over the bounding box rectangles
        for (i, (startX, startY, endX, endY)) in enumerate(rects):
            # use the bounding box coordinates to derive the centroid
            cX = int((startX + endX) / 2.0)
            cY = int((startY + endY) / 2.0)
            inputCentroids[i] = (cX, cY)

        # If we are currently not tracking any objects, take the input
        # centroids and register each of them
        if len(self.objects) == 0:
            for i in range(0, len(inputCentroids)):
                self.register(inputCentroids[i])
        # Otherwise, objects are currently being tracked, so we need to try to
        # match the input centroids to existing object centroids
        else:
            # Grab the set of object IDs and corresponding centroids
            objectIDs = list(self.objects.keys())
            objectCentroids = list(self.objects.values())
            # Compute the distance between each pair of object centroids and
            # input centroids -- our goal is to match an input centroid to an
            # existing object centroid
            D = dist.cdist(np.array(objectCentroids), inputCentroids)
            # To perform the matching we must (1) find the smallest value in
            # each row, then (2) sort the row indexes by those minimum values
            # so that the row with the smallest value is at the *front* of the
            # index list
            rows = D.min(axis=1).argsort()
            # Next, find the column index of the smallest value in each row,
            # then order those using the previously computed row index list
            cols = D.argmin(axis=1)[rows]
            # To determine if an object needs to be updated, registered, or
            # deregistered, we need to keep track of the row and column
            # indexes we have already examined
            usedRows = set()
            usedCols = set()

            # Loop over the combination of (row, column) index tuples
            for (row, col) in zip(rows, cols):
                # Ignore the row or column if we have examined it before
                if row in usedRows or col in usedCols:
                    continue
                # Otherwise, grab the object ID for the current row, set its
                # new centroid, and reset the disappeared counter
                objectID = objectIDs[row]
                self.objects[objectID] = inputCentroids[col]
                self.disappeared[objectID] = 0
                # Indicate that we have examined this row and column
                usedRows.add(row)
                usedCols.add(col)
            # Compute the row and column indexes we have NOT yet examined
            unusedRows = set(range(0, D.shape[0])).difference(usedRows)
            unusedCols = set(range(0, D.shape[1])).difference(usedCols)
            # If the number of object centroids is equal to or greater than
            # the number of input centroids, we need to check if some of
            # these objects have potentially disappeared
            if D.shape[0] >= D.shape[1]:
                # loop over the unused row indexes
                for row in unusedRows:
                    # Grab the object ID for the corresponding row index and
                    # increment the disappeared counter
                    objectID = objectIDs[row]
                    self.disappeared[objectID] += 1
                    # Check if the number of consecutive frames the object has
                    # been marked "disappeared" warrants deregistering it
                    if self.disappeared[objectID] > self.maxDisappeared:
                        self.deregister(objectID)
            # Otherwise, the number of input centroids is greater than the
            # number of existing object centroids, so we need to register
            # each new input centroid as a trackable object
            else:
                for col in unusedCols:
                    self.register(inputCentroids[col])

        # return the set of trackable objects
        return self.objects

The update method accepts a list of bounding box rectangles. The format of the rects argument is assumed to be a tuple with the following structure: (startX, startY, endX, endY).

If there are no detections, we iterate over all object IDs and increment their disappeared counts. We also check whether the maximum number of consecutive frames a given object may be marked as lost has been reached; if so, we remove it from our tracking system. Since there is no tracking information to update, we return early.

Otherwise, we initialize a NumPy array to store the center of mass of each rect. We then iterate over the bounding box rectangles, compute the centers of mass, and store them in the inputCentroids array. If we are not yet tracking any objects, we register each detection as a new object.

Otherwise, we need to update any existing object (x, y) coordinates based on the center-of-mass position that minimizes the Euclidean distance between them.

Next we compute the Euclidean distance between all pairs of objectCentroids and inputCentroids in else:

Gets the objectID and objectCentroid values.

Calculate the distance between each pair of existing object centers of mass and the new input centers of mass. The shape of the distance matrix D will be (# of object centroids, # of input centroids). To perform a match, we must (1) find the minimum value in each row, and (2) sort the row indexes by those minimum values. We then perform a similar process on the columns, taking the column index of the minimum value in each row and ordering those by the sorted row indexes. Our goal is to have the index values with the smallest corresponding distances at the front of the lists.
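A tiny numeric example (arbitrary made-up distances) of how rows and cols are derived from D:

```python
import numpy as np

# D has one row per existing object and one column per input detection.
D = np.array([[2.0, 50.0, 60.0],
              [55.0, 3.0, 40.0]])

rows = D.min(axis=1).argsort()   # object rows ordered by their best-match distance
cols = D.argmin(axis=1)[rows]    # the nearest detection for each of those rows
print(rows.tolist())  # [0, 1]: object 0 has the smallest minimum distance (2.0)
print(cols.tolist())  # [0, 1]: object 0 -> detection 0, object 1 -> detection 1
```

Note that detection 2 (the third column) is claimed by neither row; it would be registered as a new object in step 4.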

The next step is to use the distance to see if an object ID can be associated:

Initialize both collections to determine which row and column indexes we have used.

Then iterate over the combination of (row, col) index tuples to update our object center of mass:

If we have already used this row or column index, ignore it and continue the loop.

Otherwise, we have found an input center of mass that:

  1. has the minimum Euclidean distance to an existing center of mass, and
  2. has not been matched to any other object.

In that case, we update the object's center of mass and make sure to add row and col to their respective usedRows and usedCols sets.

There may be row and column indexes that we have not examined yet, i.e., that are absent from usedRows and usedCols:

So we determine which center-of-mass indexes we have not examined and store them in two new convenience sets (unusedRows and unusedCols).
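With toy values (a hypothetical 3x2 distance matrix: 3 tracked objects, 2 detections, where the matching loop consumed rows {0, 1} and columns {0, 1}), this bookkeeping reduces to set differences:

```python
usedRows = {0, 1}        # object rows consumed by the matching loop
usedCols = {0, 1}        # detection columns consumed by the matching loop
numRows, numCols = 3, 2  # D.shape for this toy example

unusedRows = set(range(numRows)).difference(usedRows)
unusedCols = set(range(numCols)).difference(usedCols)
print(unusedRows)  # {2}: object row 2 got no match, so its disappeared count grows
print(unusedCols)  # set(): every detection was matched to an existing object
```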

The final check takes care of any missing or possibly disappearing objects:

If the number of object centers is greater than or equal to the number of input centers:

We need to loop through the unused row indexes to verify that these objects are not missing or disappearing.

In the loop, we will:

1. Increase the number of times they disappear from the dictionary.

2. Check if the disappearance count exceeds the maxDisappeared threshold, if so, we will cancel the object.

Otherwise, the number of input centers is greater than the number of existing object centers and we have new objects to register and track.

The loop traverses the unusedCols indexes and registers each new center of mass. Finally, we return the set of trackable objects to the calling method.
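To sanity-check this logic end to end, here is a compact, self-contained condensation of the class above, with scipy's cdist swapped for an equivalent NumPy expression (the name MiniCentroidTracker is just for this sketch; it is a test aid, not a replacement for the full implementation). Run on two synthetic frames, it shows the IDs persisting while the centers of mass update:

```python
from collections import OrderedDict
import numpy as np

# Condensed version of the tracker described above, for demonstration only.
class MiniCentroidTracker:
    def __init__(self, maxDisappeared=50):
        self.nextObjectID = 0
        self.objects = OrderedDict()      # objectID -> center of mass
        self.disappeared = OrderedDict()  # objectID -> missed-frame count
        self.maxDisappeared = maxDisappeared

    def register(self, centroid):
        self.objects[self.nextObjectID] = centroid
        self.disappeared[self.nextObjectID] = 0
        self.nextObjectID += 1

    def deregister(self, objectID):
        del self.objects[objectID]
        del self.disappeared[objectID]

    def update(self, rects):
        if len(rects) == 0:
            for objectID in list(self.disappeared.keys()):
                self.disappeared[objectID] += 1
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)
            return self.objects
        inputCentroids = np.array(
            [((sX + eX) // 2, (sY + eY) // 2) for (sX, sY, eX, eY) in rects])
        if len(self.objects) == 0:
            for c in inputCentroids:
                self.register(c)
        else:
            objectIDs = list(self.objects.keys())
            objectCentroids = np.array(list(self.objects.values()))
            # Pairwise Euclidean distances (same matrix dist.cdist would give)
            D = np.linalg.norm(
                objectCentroids[:, None, :] - inputCentroids[None, :, :], axis=2)
            rows = D.min(axis=1).argsort()
            cols = D.argmin(axis=1)[rows]
            usedRows, usedCols = set(), set()
            for (row, col) in zip(rows, cols):
                if row in usedRows or col in usedCols:
                    continue
                objectID = objectIDs[row]
                self.objects[objectID] = inputCentroids[col]
                self.disappeared[objectID] = 0
                usedRows.add(row)
                usedCols.add(col)
            if D.shape[0] >= D.shape[1]:
                for row in set(range(D.shape[0])).difference(usedRows):
                    objectID = objectIDs[row]
                    self.disappeared[objectID] += 1
                    if self.disappeared[objectID] > self.maxDisappeared:
                        self.deregister(objectID)
            else:
                for col in set(range(D.shape[1])).difference(usedCols):
                    self.register(inputCentroids[col])
        return self.objects

ct = MiniCentroidTracker()
ct.update([(0, 0, 10, 10), (100, 100, 110, 110)])          # frame 1: IDs 0 and 1
objects = ct.update([(2, 2, 12, 12), (98, 98, 108, 108)])  # frame 2: slight motion
print(sorted(objects.keys()))  # [0, 1] -- same IDs, updated centers of mass
```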

Implementing the object tracking driver script

The CentroidTracker class has been implemented; let's use it in an object tracking driver script.

The driver script is where you can use your favorite object detector, provided it generates a set of bounding boxes. This could be Haar Cascade, HOG + Linear SVM, YOLO, SSD, Faster R-CNN, etc.

In this script, the functionality to be implemented:

1. Grab frames from a video stream (a video file in this example; a webcam also works).

2. Load and use OpenCV's deep learning face detector.

3. Instantiate CentroidTracker and use it to track face objects in the video stream and display the results.

Create a new object_tracker.py and insert the code:

# import the necessary packages
from centroidtracker import CentroidTracker
import numpy as np
import imutils
import time
import cv2

# Define the minimum confidence level
confidence_t = 0.5
# Initialize the centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)
# Load the model for detecting faces
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
    "res10_300x300_ssd_iter_140000_fp16.caffemodel")
# Initialize the video stream and allow the camera sensor to warm up
print("[INFO] starting video stream...")
vs = cv2.VideoCapture('11.mp4')
time.sleep(2.0)
fps = 30            # FPS of the saved video; adjust as appropriate
size = (600, 1066)  # Width and height, determined from the frame dimensions
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
videoWriter = cv2.VideoWriter('3.mp4', fourcc, fps, size)  # Size of the saved frames

Import the required packages.

Define the minimum confidence level.

Load the face detection model.

Initialize the video stream (passing a camera device ID instead of a filename will open the camera).

Finally, define the parameters for saving the output video (FPS, frame size, codec).

# Loop over the frames in the video stream
while True:
    # Read the next frame from the video stream and resize it
    (grabbed, frame) = vs.read()
    if not grabbed:
        break
    frame = imutils.resize(frame, width=600)
    # If the frame dimensions are None, grab them
    if W is None or H is None:
        (H, W) = frame.shape[:2]
    # Construct a blob from the frame, pass it through the network,
    # obtain the output predictions, and initialize the list of
    # bounding box rectangles
    blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
                                 (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()
    rects = []
    # Loop over the detections
    for i in range(0, detections.shape[2]):
        # Filter out weak detections by ensuring that the predicted
        # probability is greater than the minimum threshold
        if detections[0, 0, i, 2] > confidence_t:
            # Compute the (x, y) coordinates of the object's bounding box,
            # then update the list of bounding box rectangles
            box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
            rects.append(box.astype("int"))
            # Draw a bounding box around the object so we can visualize it
            (startX, startY, endX, endY) = box.astype("int")
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                          (0, 255, 0), 2)
    # Update the centroid tracker using the computed set of bounding boxes
    objects = ct.update(rects)
    # Loop over the tracked objects
    for (objectID, centroid) in objects.items():
        # Draw both the ID and the center of mass of the object
        # on the output frame
        text = "ID {}".format(objectID)
        cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
    # Show the output frame and write it to the saved video
    cv2.imshow("Frame", frame)
    videoWriter.write(frame)
    key = cv2.waitKey(1) & 0xFF
    # If the "q" key was pressed, break from the loop
    if key == ord("q"):
        break
cv2.destroyAllWindows()
vs.release()
videoWriter.release()

Iterate through the frames and resize them to a fixed width (while maintaining the aspect ratio).

The frame is then passed through a CNN object detector to obtain predictions and object locations.

Initializes a list of rects, the bounding box rectangles.

Loop over the detections; if a detection exceeds our confidence threshold, indicating it is valid, compute the bounding box.

Call the update method on the center-of-mass tracker object ct.

Next draw the object's ID and center of mass on the output frame, showing the center of mass as a solid circle and the text of the unique object ID number. Now we will be able to visualize the results and check that CentroidTracker is tracking the object correctly by associating the correct ID with the object in the video stream.

Limitations and disadvantages

While the center-of-mass tracker works well in this example, this object tracking algorithm has two major drawbacks.

1. It requires that the object detection step be run on every frame of the input video. For very fast object detectors (i.e., color thresholding and Haar cascading), having to run the detector on every input frame may not be an issue. However, if a much more computationally intensive object detector, such as HOG + linear SVM or a deep learning based detector, is used on a resource constrained device, then frame processing will be slowed down considerably.

2. It relies on the basic assumption of the center-of-mass tracking algorithm itself: that the centers of mass must lie close together between subsequent frames.

This assumption usually holds true, but remember that we use 2D frames to represent our 3D world - what happens when one object overlaps another?

The answer is that object ID switching may occur. If two or more objects overlap to the point where their centers of mass cross paths, the center of mass of one object may end up closest to the tracked position of another, and the algorithm may (unknowingly) swap their object IDs. It is important to understand that the overlapping/occluding object problem is not specific to center-of-mass tracking; it also occurs with many other object trackers, including advanced ones. However, the problem is more pronounced for center-of-mass tracking because we rely strictly on the Euclidean distances between centers of mass, with no additional metrics, heuristics, or learned models.

The above covers the details of implementing a center-of-mass tracking algorithm with OpenCV. For more information about the OpenCV center-of-mass tracking algorithm, please see my other related articles!