[File properties]:
File name: occlusion reasoning for multiple object tracking
File size: 5.32 MB
File format: PDF
Last updated: 2017-09-25 02:37:48
occlusion multiple object tracking
Abstract: Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging
problem. It becomes significantly more difficult when dense groups of indistinguishable
objects are present in the scene that cause frequent inter-object interactions and occlusions.
We present several practical solutions that tackle the inter-object occlusions for video
surveillance applications.
In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking,"
an online multi-camera spatial-temporal data association method for tracking
large groups of objects imaged with low resolution. As a variant of the well-known
Multiple-Hypothesis-Tracker, our approach localizes the positions of objects in 3D space with possibly
occluded observations from multiple camera views and performs temporal data association
in 3D. Second, we develop "track linking," a class of offline batch processing algorithms
for long-term occlusions, where the decision has to be made based on the observations from
the entire tracking sequence. We construct a graph representation to characterize occlusion
events and propose an efficient graph-based/combinatorial algorithm to resolve occlusions.
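
To make the track-linking idea concrete, the following is a minimal sketch, not the thesis's algorithm. It assumes that each occlusion event has as many tracklets ending before it as starting after it, and it links them by brute-force minimum-cost one-to-one assignment. The dictionary keys (`t1`, `p1`, `t0`, `p0`), the function names, and the `max_gap` and `max_speed` parameters are all hypothetical choices made for illustration.

```python
from itertools import permutations
from math import hypot, inf


def link_cost(a, b, max_gap=30, max_speed=5.0):
    """Cost of linking the end of tracklet `a` to the start of tracklet `b`.

    `a` has keys t1 (last frame) and p1 (last position);
    `b` has keys t0 (first frame) and p0 (first position).
    These field names are assumptions for this sketch.
    """
    gap = b["t0"] - a["t1"]
    if gap <= 0 or gap > max_gap:
        return inf  # b must start after a ends, within the allowed gap
    dist = hypot(b["p0"][0] - a["p1"][0], b["p0"][1] - a["p1"][1])
    if dist > max_speed * gap:
        return inf  # the link would require implausibly fast motion
    return dist


def link_tracklets(ended, started, max_gap=30, max_speed=5.0):
    """Return the cheapest one-to-one linking as (ended_idx, started_idx) pairs.

    Brute force over all permutations, so only suitable for the handful of
    tracklets involved in a single occlusion event. Returns None when no
    feasible linking exists.
    """
    assert len(ended) == len(started), "sketch assumes equal counts"
    best_cost, best_perm = inf, None
    for perm in permutations(range(len(started))):
        cost = sum(
            link_cost(ended[i], started[j], max_gap, max_speed)
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    if best_perm is None:
        return None
    return list(enumerate(best_perm))


# Two objects cross paths and swap sides during a five-frame occlusion.
ended = [{"t1": 10, "p1": (0.0, 0.0)}, {"t1": 10, "p1": (10.0, 0.0)}]
started = [{"t0": 15, "p0": (11.0, 1.0)}, {"t0": 15, "p0": (1.0, 1.0)}]
links = link_tracklets(ended, started)
```

In this toy case the first ended tracklet is linked to the second started tracklet and vice versa, because that pairing has the smaller total displacement. A practical system would replace the brute-force search with a polynomial-time assignment or network-flow solver and would likely use a richer cost than distance alone.
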
Third, we propose a novel Bayesian framework where detection and data association are
combined into a single module and solved jointly. Almost all traditional tracking systems
address the detection and data association tasks separately in sequential order. Such a
design implies that the output of the detector has to be reliable in order to make the data
association work. Our framework takes advantage of the often complementary nature of the
two subproblems, which not only avoids the error propagation issue from which traditional
"detection-tracking approaches" suffer but also eschews common heuristics such as
"non-maximum suppression" of hypotheses by modeling the likelihood of the entire image.
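
For context, the non-maximum suppression heuristic mentioned above is commonly implemented as a greedy procedure over scored bounding boxes. The sketch below illustrates that conventional heuristic, which the proposed framework avoids; it is not part of the thesis's method. The box format `(x1, y1, x2, y2)`, the function names, and the `iou_threshold` default are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box.

    Returns the indices of the kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box survives only if it does not overlap any already-kept box
        # by more than the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box is suppressed by the first
```

In a dense scene, the suppressed box may correspond to a real, partially occluded object rather than a duplicate detection. This is one reason a hard overlap threshold can be a poor fit for crowded, occlusion-heavy tracking.
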
The thesis describes a substantial number of experiments, involving challenging, notably
distinct simulated and real data, including infrared and visible-light data sets that we either
recorded ourselves or took from publicly available sources. In these videos, the number of objects
ranges from a dozen to a hundred per frame in both monocular and multiple views. The
experiments demonstrate that our approaches achieve results comparable to those of
state-of-the-art approaches.