Publications

Instance-Aware Predictive Navigation in Multi-Agent Environments

ICRA 2021 A visual model-based RL method that considers multiple hypotheses for future object movement.

Dense Prediction with Attentive Feature Aggregation

arXiv 2021 We propose Attentive Feature Aggregation (AFA) to exploit both spatial and channel information for semantic segmentation and boundary detection.

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

CVPR 2020 Oral The largest driving video dataset for heterogeneous multitask learning.

Frustratingly Simple Few-Shot Object Detection

ICML 2020 A state-of-the-art few-shot detection method that fine-tunes only the last layer of an existing detector on rare classes.

Learning Saliency Propagation for Semi-Supervised Instance Segmentation

CVPR 2020 We propose a ShapeProp module that propagates information between object detection and segmentation supervision for semi-supervised instance segmentation.

Joint Monocular 3D Vehicle Detection and Tracking

ICCV 2019 We propose a novel online framework for 3D vehicle detection and tracking from monocular videos.

Disentangling Propagation and Generation for Video Prediction

ICCV 2019 We describe a computational model for high-fidelity video prediction which disentangles motion-specific propagation from motion-agnostic generation.

Few Shot Object Detection via Feature Reweighting

ICCV 2019 We develop a few-shot object detector that can learn to detect novel objects from only a few annotated examples.

Deep Mixture of Experts via Shallow Embedding

UAI 2019 We explore a mixture of experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis.

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

CVPR 2019 We propose Task-Aware Feature Embedding Networks (TAFE-Nets) to learn how to adapt the image representation to a new task in a meta learning fashion.

Hierarchical Discrete Distribution Decomposition for Match Density Estimation

CVPR 2019 We propose Hierarchical Discrete Distribution Decomposition (HD³), a framework suitable for learning probabilistic pixel correspondences in both optical flow and stereo matching.

Semantic Predictive Control for Explainable and Efficient Policy Learning

ICRA 2019 We propose a driving policy learning framework that predicts feature representations of future visual inputs.

Deep Object-Centric Policies for Autonomous Driving

ICRA 2019 We show that object-centric models outperform object-agnostic methods in scenes with other vehicles and pedestrians.

SkipNet: Learning Dynamic Routing in Convolutional Networks

ECCV 2018 We introduce SkipNet, a modified residual network that uses a gating network to selectively skip convolutional blocks based on the activations of the previous layer.

Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation

ECCV 2018 We aim to characterize adversarial examples based on spatial context information in semantic segmentation.

IDK Cascades: Fast Deep Learning by Learning not to Overthink

UAI 2018 We introduce the “I Don’t Know” (IDK) prediction cascades framework to accelerate inference without a loss in prediction accuracy.

Deep Layer Aggregation

CVPR 2018 Oral We augment standard architectures with deeper aggregation to better fuse information across layers.

TextureGAN: Controlling Deep Image Synthesis with Texture Patches

CVPR 2018 Spotlight We develop a local texture loss in addition to adversarial and content loss to train the generative network.

PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup

CVPR 2018 We introduce an automatic method for editing a portrait photo so that the subject appears to be wearing makeup in the style of another person in a reference photo.

Interactive 3D Modeling with a Generative Adversarial Network

3DV 2017 We propose using a generative adversarial network (GAN) to assist a novice user in designing real-world shapes with a simple interface.

Dilated Residual Networks

CVPR 2017 We show that dilated residual networks (DRNs) outperform their non-dilated counterparts in image classification without increasing the model’s depth or complexity.

End-to-end Learning of Driving Models from Large-scale Video Datasets

CVPR 2017 Oral We develop an end-to-end trainable architecture for learning to predict a distribution over future vehicle egomotion from instantaneous monocular camera observations and previous vehicle state.

Scribbler: Controlling Deep Image Synthesis with Sketch and Color

CVPR 2017 We propose a deep adversarial image synthesis architecture that is conditioned on sketched boundaries and sparse color strokes to generate realistic images.

Semantic Scene Completion from a Single Depth Image

CVPR 2017 Oral Our network uses a dilation-based 3D context module to efficiently expand the receptive field and enable 3D context learning.