Abstract
Many hand-held or mixed-reality devices are used with a single sensor for 3D reconstruction, even though they often comprise multiple sensors. Multi-sensor depth fusion can substantially improve the robustness and accuracy of 3D reconstruction methods, but existing techniques are not robust enough to handle sensors that operate with diverse value ranges as well as noise and outlier statistics. To this end, we introduce SenFuNet, a depth fusion approach that learns sensor-specific noise and outlier statistics and combines the data streams of depth frames from different sensors in an online fashion. Our method fuses multi-sensor depth streams regardless of time synchronization and calibration and generalizes well with little training data. We conduct experiments with various sensor combinations on the real-world CoRBS and Scene3D datasets, as well as the Replica dataset. Experiments demonstrate that our fusion strategy outperforms traditional and recent online depth fusion approaches. In addition, combining multiple sensors yields more robust outlier handling and more precise surface reconstruction than using a single sensor.
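To make the idea of confidence-weighted multi-sensor fusion concrete, the sketch below shows a minimal, purely illustrative per-voxel blend of two sensors' signed-distance observations using per-sensor confidence weights. The function name `fuse_sensors`, the inputs, and the weighting scheme are assumptions for illustration; they are not the learned fusion network described in the paper or the API of the released code.

```python
# Illustrative sketch only: per-voxel fusion of two depth sensors' TSDF
# observations using per-sensor confidence weights. Names and shapes are
# hypothetical, not taken from the SenFuNet paper or repository.
import numpy as np

def fuse_sensors(tsdf_a, conf_a, tsdf_b, conf_b, eps=1e-8):
    """Blend two per-voxel signed-distance estimates by their confidences.

    tsdf_a, tsdf_b : signed-distance values per voxel (same shape).
    conf_a, conf_b : non-negative confidences, e.g. scores that could come
                     from a learned per-sensor noise/outlier model.
    """
    w_sum = conf_a + conf_b + eps
    fused_tsdf = (conf_a * tsdf_a + conf_b * tsdf_b) / w_sum
    fused_conf = np.maximum(conf_a, conf_b)  # keep the stronger evidence
    return fused_tsdf, fused_conf

# Toy usage: two voxels; sensor B is trusted far more in the second voxel.
tsdf_a = np.array([0.10, 0.40])
tsdf_b = np.array([0.05, 0.00])
conf_a = np.array([0.90, 0.10])
conf_b = np.array([0.80, 0.95])
print(fuse_sensors(tsdf_a, conf_a, tsdf_b, conf_b)[0])  # ~[0.076, 0.038]
```

In this toy example, the second voxel's fused value stays close to sensor B's estimate because sensor A is assigned a low confidence there, which is the intuition behind letting learned, sensor-specific statistics drive the fusion.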
Visualizations
Paper
Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc Van Gool. *Learning Online Multi-Sensor Depth Fusion*. ECCV 2022.
Code

github.com/tfy14esa/SenFuNet
Citation
@inproceedings{Sandstrom2022,
title = {Learning Online Multi-Sensor Depth Fusion},
author = {Sandström, Erik and Oswald, Martin R. and Kumar, Suryansh and Weder, Silvan and Yu, Fisher and Sminchisescu, Cristian and Van Gool, Luc},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2022}
}