While numerous algorithms have been proposed for object tracking with demonstrated success, it remains challenging for a tracker to handle large appearance changes caused by factors such as scale, motion, shape deformation, and occlusion. One of the main reasons is the lack of an effective image representation that accounts for appearance variation. Most trackers use high-level appearance structure or low-level cues for representing and matching target objects. In this paper, we propose a tracking method from the perspective of mid-level vision, with structural information captured in superpixels. We present a discriminative appearance model based on superpixels, which enables the tracker to distinguish the target from the background using mid-level cues. The tracking task is then formulated as computing a target-background confidence map and obtaining the best candidate by maximum a posteriori estimation. Experimental results demonstrate that our tracker is able to handle heavy occlusion and recover from drift. In conjunction with online updates, the proposed algorithm is shown to perform favorably against existing methods for object tracking. Furthermore, the proposed algorithm facilitates foreground and background segmentation during tracking.
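As a rough illustration of the last formulation step (scoring candidates on a target-background confidence map and keeping the maximum a posteriori one), the following Python sketch may help. It is not the released MATLAB implementation: the function names, the toy confidence map, the use of summed confidence as a log-likelihood proxy, and the Gaussian motion prior with `sigma` are all assumptions made for illustration.

```python
import math

def box_score(conf_map, box):
    """Sum the confidence values inside box = (x, y, w, h)."""
    x, y, w, h = box
    return sum(conf_map[r][c] for r in range(y, y + h) for c in range(x, x + w))

def map_estimate(conf_map, candidates, prev_box, sigma=4.0):
    """Return the candidate with the highest (unnormalized) log posterior.

    Assumption: the summed confidence acts as a log-likelihood proxy, and the
    motion prior is an isotropic Gaussian on the displacement from prev_box.
    """
    px, py = prev_box[0], prev_box[1]
    best_box, best_log_post = None, -math.inf
    for box in candidates:
        dx, dy = box[0] - px, box[1] - py
        log_prior = -(dx * dx + dy * dy) / (2.0 * sigma * sigma)
        log_post = box_score(conf_map, box) + log_prior
        if log_post > best_log_post:
            best_box, best_log_post = box, log_post
    return best_box

# Toy confidence map: positive values mark target-like pixels,
# negative values mark background-like pixels.
conf_map = [[1.0 if 2 <= r <= 3 and 2 <= c <= 3 else -1.0 for c in range(6)]
            for r in range(6)]

candidates = [(0, 0, 2, 2), (2, 2, 2, 2), (4, 4, 2, 2)]
best = map_estimate(conf_map, candidates, prev_box=(1, 1, 2, 2))
print(best)
```

In this toy setting the candidate covering the positive region wins, because its summed confidence outweighs the small penalty for moving away from the previous box.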
Comparisons with other trackers
- Robust Superpixel Tracking
Fan Yang, Huchuan Lu and Ming-Hsuan Yang
IEEE Transactions on Image Processing (TIP), vol. 23, no. 4, pp. 1639-1651, 2014. [paper][supplementary material]
- Superpixel Tracking
Shu Wang, Huchuan Lu, Fan Yang and Ming-Hsuan Yang
In Proceedings of the 13th International Conference on Computer Vision (ICCV 2011), pp. 1323-1330, 2011. [paper]
Other state-of-the-art trackers
Note: only the results of the best four trackers on each sequence are shown for clarity.
box sequence from PROST
board sequence from PROST
Code and Datasets
The MATLAB implementation can be downloaded from here (version 2.4; both Windows and Linux are supported). Please see the README for more details.
The sequences from our dataset are available here with ground truth (note that we use a resized version of singer1). Other sequences can be found in the PROST, VTD, FRAG and PDAT datasets.