Deep 6-DOF Tracking

We present a temporal 6-DOF tracking method which leverages deep learning to achieve state-of-the-art performance on challenging datasets of real-world capture. Our method is both more accurate and more robust to occlusions than the best-performing existing approaches, while maintaining real-time performance. To assess its efficacy, we evaluate our approach on several challenging RGBD sequences of real objects in a variety of conditions. Notably, we systematically evaluate robustness to occlusions through a series of sequences where the object to be tracked is increasingly occluded. Finally, our approach is purely data-driven and does not require any hand-designed features: robust tracking is automatically learned from data.


Mathieu Garon and Jean-François Lalonde
Deep 6-DOF Tracking
IEEE Transactions on Visualization and Computer Graphics, 23(11), November 2017
[arXiv:1703.09771 pre-print] [BibTeX]


Other results

After publication of the TVCG paper, we tried replacing the convolution layers of the network with the SqueezeNet architecture. Doing so results in much greater robustness to occlusions, at the expense of being slightly less stable, as demonstrated by the following occlusion graphs:


The authors gratefully acknowledge the following funding sources:

  • FRQ-NT New Researcher Grant 2016NC189939
  • NSERC Discovery Grant RGPIN-2014-05314
  • NVIDIA Corporation, for the donation of the Tesla K40 and Titan X GPUs used for this research
  • REPARTI Strategic Network
