Joint Monocular 3D Vehicle Detection and Tracking

Hou-Ning Hu, Qizhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu
ICCV 2019

Abstract

Vehicle 3D extents and trajectories are critical cues for predicting the future location of vehicles and planning future agent ego-motion based on those predictions. In this paper, we propose a novel online framework for 3D vehicle detection and tracking from monocular videos. The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform. Our method leverages 3D box depth-ordering matching for robust instance association and utilizes 3D trajectory prediction for re-identification of occluded vehicles. We also design a motion learning module based on an LSTM for more accurate long-term motion extrapolation. Our experiments on a simulation dataset and the KITTI tracking dataset show that our 3D tracking pipeline offers robust data association and tracking.

Code

github.com/ucbdrive/3d-vehicle-tracking

Citation

@inproceedings{Hu3DT19,
  author = {Hu, Hou-Ning and Cai, Qi-Zhi and Wang, Dequan
  and Lin, Ji and Sun, Min and Krähenbühl, Philipp and
  Darrell, Trevor and Yu, Fisher},
  title = {Joint Monocular 3D Vehicle Detection and Tracking},
  booktitle = {ICCV},
  year = {2019}
}

Related

OVTrack: Open-Vocabulary Multiple Object Tracking

CVPR 2023 We introduce the first open-vocabulary multiple object tracker OVTrack trained from only static images and an evaluation benchmark.

CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion

CoRL 2022 We propose a method for panoramic 3D object tracking, called CC-3DT, that associates and models object trajectories both temporally and across views.

Tracking Every Thing in the Wild

ECCV 2022 We introduce a new metric, Track Every Thing Accuracy (TETA), and a Track Every Thing tracker (TETer), which performs association using Class Exemplar Matching (CEM).

Video Mask Transfiner for High-Quality Video Instance Segmentation

ECCV 2022 We propose the Video Mask Transfiner (VMT) method, which leverages fine-grained high-resolution features through a highly efficient video transformer structure.

Video Mask Transfiner for High-Quality Video Instance Segmentation

ECCV 2022 We introduce the HQ-YTVIS dataset as well as Tube-Boundary AP, which provide training, validation and testing support to facilitate future development of VIS methods aiming at higher mask quality.

SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation

CVPR 2022 We introduce the largest synthetic dataset for autonomous driving to study continuous domain adaptation and multi-task perception.

Transforming Model Prediction for Tracking

CVPR 2022 We propose a tracker architecture employing a Transformer-based model prediction module.

Monocular Quasi-Dense 3D Object Tracking

TPAMI 2022 We combine quasi-dense tracking on 2D images and motion prediction in 3D space to achieve a significant advance in 3D object tracking from monocular videos.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

NeurIPS 2021 Spotlight We propose the Prototypical Cross-Attention Network (PCAN), which leverages rich spatio-temporal information for online multiple object tracking and segmentation.

Quasi-Dense Similarity Learning for Multiple Object Tracking

CVPR 2021 Oral We propose a simple yet effective multi-object tracking method in this paper.