Deep Object-Centric Policies for Autonomous Driving

Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, Trevor Darrell
ICRA 2019

Abstract

While learning visuomotor skills in an end-to-end manner is appealing, deep neural networks are often uninterpretable and fail in surprising ways. For robotics tasks, such as autonomous driving, models that explicitly represent objects may be more robust to new scenes and provide intuitive visualizations. We describe a taxonomy of “object-centric” models which leverage both object instances and end-to-end learning. In the Grand Theft Auto V simulator, we show that object-centric models outperform object-agnostic methods in scenes with other vehicles and pedestrians, even with an imperfect detector. We also demonstrate that our architectures perform well in real-world environments by evaluating on the Berkeley DeepDrive Video dataset, where an object-centric model outperforms object-agnostic models in low-data regimes.

Citation

@inproceedings{wang2018deep,
  title={Deep Object-Centric Policies for Autonomous Driving},
  author={Wang, Dequan and Devin, Coline and Cai, Qi-Zhi and Yu, Fisher and Darrell, Trevor},
  booktitle={ICRA},
  year={2019}
}

Related


End-to-end Learning of Driving Models from Large-scale Video Datasets

CVPR 2017 Oral. We develop an end-to-end trainable architecture for learning to predict a distribution over future vehicle egomotion from instantaneous monocular camera observations and previous vehicle state.


Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving

ICCV 2023. VTD is a promising new direction for exploring the unification of perception tasks in autonomous driving.


End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

ICCV 2021. We demonstrated that an RL coach (Roach) would be a better choice to supervise imitation learning agents.


Instance-Aware Predictive Navigation in Multi-Agent Environments

ICRA 2021. A new visual model-based RL method that considers multiple hypotheses for future object movement.


Semantic Predictive Control for Explainable and Efficient Policy Learning

ICRA 2019. We propose a driving policy learning framework that predicts feature representations of future visual inputs.