Motion-Based Multiple Object Tracking

Time: 2023-12-31 15:12:57

Kalman filter tracking...

%% Motion-Based Multiple Object Tracking
% This example shows how to perform automatic detection and motion-based
% tracking of moving objects in a video from a stationary camera.
%
% Copyright The MathWorks, Inc.

%%
% Detection of moving objects and motion-based tracking are important
% components of many computer vision applications, including activity
% recognition, traffic monitoring, and automotive safety. The problem of
% motion-based object tracking can be divided into two parts:
%
% # detecting moving objects in each frame
% # associating the detections corresponding to the same object over time
%
% The detection of moving objects uses a background subtraction algorithm
% based on Gaussian mixture models. Morphological operations are applied to
% the resulting foreground mask to eliminate noise. Finally, blob analysis
% detects groups of connected pixels, which are likely to correspond to
% moving objects.
%
% The association of detections to the same object is based solely on
% motion. The motion of each track is estimated by a Kalman filter. The
% filter is used to predict the track's location in each frame, and
% determine the likelihood of each detection being assigned to each
% track.
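%
% As a rough illustration of the prediction step, the following sketch
% propagates a constant-velocity state by one frame using plain matrix
% arithmetic. The state layout |[x; vx; y; vy]|, the unit time step, and
% all numeric values are assumptions chosen for illustration; they are not
% taken from |vision.KalmanFilter|.
%
%   A = [1 1 0 0; 0 1 0 0; 0 0 1 1; 0 0 0 1]; % constant velocity, dt = 1 frame
%   H = [1 0 0 0; 0 0 1 0];                   % only the position is measured
%   x = [10; 2; 20; -1];                      % hypothetical state [x; vx; y; vy]
%   P = eye(4);                               % hypothetical state covariance
%   Q = 0.01 * eye(4);                        % hypothetical process noise
%   x = A * x;                                % predicted state
%   P = A * P * A' + Q;                       % predicted covariance
%   predictedCentroid = (H * x)';             % 1-by-2 predicted [x, y]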
%
% Track maintenance becomes an important aspect of this example. In any
% given frame, some detections may be assigned to tracks, while other
% detections and tracks may remain unassigned. The assigned tracks are
% updated using the corresponding detections. The unassigned tracks are
% marked invisible. An unassigned detection begins a new track.
%
% Each track keeps count of the number of consecutive frames in which it
% remained unassigned. If the count exceeds a specified threshold, the
% example assumes that the object has left the field of view and deletes
% the track.
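%
% A minimal sketch of this bookkeeping for a single track, using a
% hypothetical detection history and a hypothetical threshold rather than
% the values used by the example:
%
%   wasDetected = [1 1 0 0 1 0 0 0];  % hypothetical per-frame detection flags
%   threshold = 2;                    % hypothetical deletion threshold
%   count = 0;                        % consecutive frames without a detection
%   deletedAtFrame = 0;               % 0 means the track was never deleted
%   for k = 1:numel(wasDetected)
%       if wasDetected(k)
%           count = 0;                % a detection resets the counter
%       else
%           count = count + 1;
%       end
%       if count > threshold
%           deletedAtFrame = k;       % object assumed to have left the view
%           break;
%       end
%   end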
%
% For more information please see
% <matlab:helpview(fullfile(docroot,'toolbox','vision','vision.map'),'multipleObjectTracking') Multiple Object Tracking>.
%
% This example is a function with the main body at the top and helper
% routines in the form of
% <matlab:helpview(fullfile(docroot,'toolbox','matlab','matlab_prog','matlab_prog.map'),'nested_functions') nested functions>
% below.

function multiObjectTracking()

% Create System objects used for reading video, detecting moving objects,
% and displaying the results.
obj = setupSystemObjects();

tracks = initializeTracks(); % Create an empty array of tracks.

nextId = 1; % ID of the next track

% Detect moving objects, and track them across video frames.
while ~isDone(obj.reader)
frame = readFrame();
[centroids, bboxes, mask] = detectObjects(frame);
predictNewLocationsOfTracks();
[assignments, unassignedTracks, unassignedDetections] = ...
detectionToTrackAssignment();

updateAssignedTracks();
updateUnassignedTracks();
deleteLostTracks();
createNewTracks();

displayTrackingResults();
end

%% Create System Objects
% Create System objects used for reading the video frames, detecting
% foreground objects, and displaying results.

function obj = setupSystemObjects()
% Initialize Video I/O
% Create objects for reading a video from a file, drawing the tracked
% objects in each frame, and playing the video.

% Create a video file reader.
obj.reader = vision.VideoFileReader('atrium.avi');

% Create two video players, one to display the video,
% and one to display the foreground mask.
obj.videoPlayer = vision.VideoPlayer('Position', [, , , ]);
obj.maskPlayer = vision.VideoPlayer('Position', [, , , ]);

% Create System objects for foreground detection and blob analysis

% The foreground detector is used to segment moving objects from
% the background. It outputs a binary mask, where the pixel value
% of 1 corresponds to the foreground and the value of 0 corresponds
% to the background.

obj.detector = vision.ForegroundDetector('NumGaussians', , ...
'NumTrainingFrames', , 'MinimumBackgroundRatio', 0.7);

% Connected groups of foreground pixels are likely to correspond to moving
% objects. The blob analysis System object is used to find such groups
% (called 'blobs' or 'connected components'), and compute their
% characteristics, such as area, centroid, and the bounding box.

obj.blobAnalyser = vision.BlobAnalysis('BoundingBoxOutputPort', true, ...
'AreaOutputPort', true, 'CentroidOutputPort', true, ...
'MinimumBlobArea', );
end

%% Initialize Tracks
% The |initializeTracks| function creates an array of tracks, where each
% track is a structure representing a moving object in the video. The
% purpose of the structure is to maintain the state of a tracked object.
% The state consists of information used for detection to track assignment,
% track termination, and display.
%
% The structure contains the following fields:
%
% * |id| : the integer ID of the track
% * |bbox| : the current bounding box of the object; used
% for display
% * |kalmanFilter| : a Kalman filter object used for motion-based
% tracking
% * |age| : the number of frames since the track was first
% detected
% * |totalVisibleCount| : the total number of frames in which the track
% was detected (visible)
% * |consecutiveInvisibleCount| : the number of consecutive frames for
% which the track was not detected (invisible).
%
% Noisy detections tend to result in short-lived tracks. For this reason,
% the example only displays an object after it was tracked for some number
% of frames. This happens when |totalVisibleCount| exceeds a specified
% threshold.
%
% When no detections are associated with a track for several consecutive
% frames, the example assumes that the object has left the field of view
% and deletes the track. This happens when |consecutiveInvisibleCount|
% exceeds a specified threshold. A track may also get deleted as noise if
% it was tracked for a short time, and marked invisible for most of the
% frames.

function tracks = initializeTracks()
% create an empty array of tracks
tracks = struct(...
'id', {}, ...
'bbox', {}, ...
'kalmanFilter', {}, ...
'age', {}, ...
'totalVisibleCount', {}, ...
'consecutiveInvisibleCount', {});
end

%% Read a Video Frame
% Read the next video frame from the video file.
function frame = readFrame()
frame = obj.reader.step();
end

%% Detect Objects
% The |detectObjects| function returns the centroids and the bounding boxes
% of the detected objects. It also returns the binary mask, which has the
% same size as the input frame. Pixels with a value of 1 correspond to the
% foreground, and pixels with a value of 0 correspond to the background.
%
% The function performs motion segmentation using the foreground detector.
% It then performs morphological operations on the resulting binary mask to
% remove noisy pixels and to fill the holes in the remaining blobs.

function [centroids, bboxes, mask] = detectObjects(frame)

% Detect foreground.
mask = obj.detector.step(frame);

% Apply morphological operations to remove noise and fill in holes.
mask = imopen(mask, strel('rectangle', [,]));
mask = imclose(mask, strel('rectangle', [, ]));
mask = imfill(mask, 'holes');

% Perform blob analysis to find connected components.
[~, centroids, bboxes] = obj.blobAnalyser.step(mask);
end

%% Predict New Locations of Existing Tracks
% Use the Kalman filter to predict the centroid of each track in the
% current frame, and update its bounding box accordingly.

function predictNewLocationsOfTracks()
for i = 1:length(tracks)
bbox = tracks(i).bbox;

% Predict the current location of the track.
predictedCentroid = predict(tracks(i).kalmanFilter);

% Shift the bounding box so that its center is at
% the predicted location.
predictedCentroid = int32(predictedCentroid) - bbox(3:4) / 2;
tracks(i).bbox = [predictedCentroid, bbox(3:4)];
end
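% A worked instance of the bounding box shift in the loop above, using a
% hypothetical bounding box and a hypothetical predicted centroid.
% Bounding boxes are assumed to be [x, y, width, height], with (x, y) at
% the upper-left corner.
%
%   bbox = int32([100, 50, 40, 20]);     % hypothetical [x, y, width, height]
%   predictedCentroid = [130.2, 70.7];   % hypothetical filter output
%   corner = int32(predictedCentroid) - bbox(3:4) / 2; % new upper-left corner
%   shifted = [corner, bbox(3:4)];       % same size, new position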
end

%% Assign Detections to Tracks
% Assigning object detections in the current frame to existing tracks is
% done by minimizing cost. The cost is defined as the negative
% log-likelihood of a detection corresponding to a track.
%
% The algorithm involves two steps:
%
% Step 1: Compute the cost of assigning every detection to each track using
% the |distance| method of the |vision.KalmanFilter| System object(TM). The
% cost takes into account the Euclidean distance between the predicted
% centroid of the track and the centroid of the detection. It also includes
% the confidence of the prediction, which is maintained by the Kalman
% filter. The results are stored in an MxN matrix, where M is the number of
% tracks, and N is the number of detections.
%
% Step 2: Solve the assignment problem represented by the cost matrix using
% the |assignDetectionsToTracks| function. The function takes the cost
% matrix and the cost of not assigning any detections to a track.
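%
% The two steps can be sketched without the toolbox as follows. This is an
% illustration under stated assumptions: plain Euclidean distance stands
% in for the |distance| method (so the confidence of the prediction is
% ignored), and |matchpairs| from base MATLAB stands in for
% |assignDetectionsToTracks|. All coordinates and the non-assignment cost
% are hypothetical.
%
%   predicted  = [10 10; 50 50];         % hypothetical predicted centroids (M-by-2)
%   detections = [52 49; 11 12; 90 90];  % hypothetical detected centroids (N-by-2)
%   M = size(predicted, 1);
%   N = size(detections, 1);
%   cost = zeros(M, N);
%   for i = 1:M
%       d = detections - predicted(i, :);    % offsets from track i to each detection
%       cost(i, :) = sqrt(sum(d .^ 2, 2))';  % Euclidean distance per detection
%   end
%   costOfNonAssignment = 5;                 % hypothetical value
%   [pairs, unassignedTracks, unassignedDetections] = ...
%       matchpairs(cost, costOfNonAssignment);
%
% Each row of |pairs| holds a track index and a detection index. In this
% sketch the third detection is far from both tracks, so it is expected to
% remain unassigned.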
%
% The value for the cost of not assigning a detection to a track depends on
% the range of values returned by the |distance| method of the
% |vision.KalmanFilter|. This value must be tuned experimentally. Setting
% it too low increases the likelihood of creating a new track, and may
% result in track fragmentation. Setting it too high may result in a single
% track corresponding to a series of separate moving objects.
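%
% The effect of this value can be seen with a single track and a single
% detection. The sketch below uses |matchpairs| from base MATLAB as a
% stand-in for |assignDetectionsToTracks|; the cost and both settings are
% hypothetical.
%
%   cost = 8;                          % hypothetical cost of the only candidate pair
%   pairsLow  = matchpairs(cost, 2);   % low value: leaving both unassigned is cheaper
%   pairsHigh = matchpairs(cost, 10);  % high value: the pair is assigned
%
% With the low setting the detection stays unassigned and would start a new
% track; with the high setting it is attached to the existing track.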
%
% The |assignDetectionsToTracks| function uses Munkres' version of the
% Hungarian algorithm to compute an assignment which minimizes the total
% cost. It returns an M x 2 matrix containing the corresponding indices of
% assigned tracks and detections in its two columns. It also returns the
% indices of tracks and detections that remained unassigned.

function [assignments, unassignedTracks, unassignedDetections] = ...
detectionToTrackAssignment()

nTracks = length(tracks);
nDetections = size(centroids, 1);

% Compute the cost of assigning each detection to each track.
cost = zeros(nTracks, nDetections);
for i = 1:nTracks
cost(i, :) = distance(tracks(i).kalmanFilter, centroids);
end

% Solve the assignment problem.
costOfNonAssignment = ;
[assignments, unassignedTracks, unassignedDetections] = ...
assignDetectionsToTracks(cost, costOfNonAssignment);
end

%% Update Assigned Tracks
% The |updateAssignedTracks| function updates each assigned track with the
% corresponding detection. It calls the |correct| method of
% |vision.KalmanFilter| to correct the location estimate. Next, it stores
% the new bounding box, and increases the age of the track and the total
% visible count by 1. Finally, the function sets the invisible count to 0.

function updateAssignedTracks()
numAssignedTracks = size(assignments, 1);
for i = 1:numAssignedTracks
trackIdx = assignments(i, 1);
detectionIdx = assignments(i, 2);
centroid = centroids(detectionIdx, :);
bbox = bboxes(detectionIdx, :);

% Correct the estimate of the object's location
% using the new detection.
correct(tracks(trackIdx).kalmanFilter, centroid);

% Replace predicted bounding box with detected
% bounding box.
tracks(trackIdx).bbox = bbox;

% Update track's age.
tracks(trackIdx).age = tracks(trackIdx).age + 1;

% Update visibility.
tracks(trackIdx).totalVisibleCount = ...
tracks(trackIdx).totalVisibleCount + 1;
tracks(trackIdx).consecutiveInvisibleCount = 0;
end
end

%% Update Unassigned Tracks
% Mark each unassigned track as invisible, and increase its age by 1.

function updateUnassignedTracks()
for i = 1:length(unassignedTracks)
ind = unassignedTracks(i);
tracks(ind).age = tracks(ind).age + 1;
tracks(ind).consecutiveInvisibleCount = ...
tracks(ind).consecutiveInvisibleCount + 1;
end
end

%% Delete Lost Tracks
% The |deleteLostTracks| function deletes tracks that have been invisible
% for too many consecutive frames. It also deletes recently created tracks
% that have been invisible for too many frames overall.

function deleteLostTracks()
if isempty(tracks)
return;
end

invisibleForTooLong = ;
ageThreshold = ;

% Compute the fraction of the track's age for which it was visible.
ages = [tracks(:).age];
totalVisibleCounts = [tracks(:).totalVisibleCount];
visibility = totalVisibleCounts ./ ages;

% Find the indices of 'lost' tracks.
lostInds = (ages < ageThreshold & visibility < 0.6) | ...
[tracks(:).consecutiveInvisibleCount] >= invisibleForTooLong;

% Delete lost tracks.
tracks = tracks(~lostInds);
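% A worked instance of the test above on three hypothetical tracks. The
% counts and both thresholds are made up for illustration and are not the
% values used by this example.
%
%   ages = [4 10 3];                          % hypothetical track ages
%   totalVisibleCounts = [1 9 3];             % hypothetical visible counts
%   consecutiveInvisible = [1 6 0];           % hypothetical invisible streaks
%   ageThreshold = 5;                         % hypothetical
%   invisibleForTooLong = 5;                  % hypothetical
%   visibility = totalVisibleCounts ./ ages;  % fraction of frames visible
%   lostInds = (ages < ageThreshold & visibility < 0.6) | ...
%       consecutiveInvisible >= invisibleForTooLong;
%
% The first track is young and mostly invisible, the second has been
% invisible for too long, and the third is kept.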
end

%% Create New Tracks
% Create new tracks from unassigned detections. Assume that any unassigned
% detection is a start of a new track. In practice, you can use other cues
% to eliminate noisy detections, such as size, location, or appearance.

function createNewTracks()
centroids = centroids(unassignedDetections, :);
bboxes = bboxes(unassignedDetections, :);

for i = 1:size(centroids, 1)

centroid = centroids(i,:);
bbox = bboxes(i, :);

% Create a Kalman filter object.
kalmanFilter = configureKalmanFilter('ConstantVelocity', ...
centroid, [, ], [, ], );

% Create a new track.
newTrack = struct(...
'id', nextId, ...
'bbox', bbox, ...
'kalmanFilter', kalmanFilter, ...
'age', 1, ...
'totalVisibleCount', 1, ...
'consecutiveInvisibleCount', 0);

% Add it to the array of tracks.
tracks(end + 1) = newTrack;

% Increment the next id.
nextId = nextId + 1;
end
end

%% Display Tracking Results
% The |displayTrackingResults| function draws a bounding box and label ID
% for each track on the video frame and the foreground mask. It then
% displays the frame and the mask in their respective video players.

function displayTrackingResults()
% Convert the frame and the mask to uint8 RGB.
frame = im2uint8(frame);
mask = uint8(repmat(mask, [1, 1, 3])) .* 255;

minVisibleCount = ;
if ~isempty(tracks)

% Noisy detections tend to result in short-lived tracks.
% Only display tracks that have been visible for more than
% a minimum number of frames.
reliableTrackInds = ...
[tracks(:).totalVisibleCount] > minVisibleCount;
reliableTracks = tracks(reliableTrackInds);

% Display the objects. If an object has not been detected
% in this frame, display its predicted bounding box.
if ~isempty(reliableTracks)
% Get bounding boxes.
bboxes = cat(1, reliableTracks.bbox);

% Get ids.
ids = int32([reliableTracks(:).id]);

% Create labels for objects indicating the ones for
% which we display the predicted rather than the actual
% location.
labels = cellstr(int2str(ids'));
predictedTrackInds = ...
[reliableTracks(:).consecutiveInvisibleCount] > 0;
isPredicted = cell(size(labels));
isPredicted(predictedTrackInds) = {' predicted'};
labels = strcat(labels, isPredicted);

% Draw the objects on the frame.
frame = insertObjectAnnotation(frame, 'rectangle', ...
bboxes, labels);

% Draw the objects on the mask.
mask = insertObjectAnnotation(mask, 'rectangle', ...
bboxes, labels);
end
end

% Display the mask and the frame.
obj.maskPlayer.step(mask);
obj.videoPlayer.step(frame);
end

%% Summary
% This example created a motion-based system for detecting and
% tracking multiple moving objects. Try using a different video to see if
% you are able to detect and track objects. Try modifying the parameters
% for the detection, assignment, and deletion steps.
%
% The tracking in this example was solely based on motion with the
% assumption that all objects move in a straight line with constant speed.
% When the motion of an object significantly deviates from this model, the
% example may produce tracking errors. Notice the mistake in tracking the
% person labeled #, when he is occluded by the tree.
%
% The likelihood of tracking errors can be reduced by using a more complex
% motion model, such as constant acceleration, or by using multiple Kalman
% filters for every object. Also, you can incorporate other cues for
% associating detections over time, such as size, shape, and color.

displayEndOfDemoMessage(mfilename)
end