Abstract
Aggregating information from features across different layers is an essential operation for dense prediction models. Despite its limited expressiveness, feature concatenation dominates the choice of aggregation operations. In this paper, we introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations. AFA exploits both spatial and channel attention to compute a weighted average of the layer activations. Inspired by neural volume rendering, we extend AFA with Scale-Space Rendering (SSR) to perform late fusion of multi-scale predictions. AFA is applicable to a wide range of existing network designs. Our experiments show consistent and significant improvements on challenging semantic segmentation benchmarks, including Cityscapes, BDD100K, and Mapillary Vistas, at negligible computational and parameter overhead. In particular, AFA improves the performance of the Deep Layer Aggregation (DLA) model by nearly 6% mIoU on Cityscapes. Our experimental analyses show that AFA learns to progressively refine segmentation maps and to improve boundary details, leading to new state-of-the-art results on the BSDS500 and NYUDv2 boundary detection benchmarks.
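The two ideas in the abstract, an attention-weighted average of two layers' activations and a volume-rendering-style fusion of multi-scale predictions, can be illustrated with a minimal NumPy sketch. This is not the implementation from the paper or the repository. The function names, the weight parameters `w_spatial` and `w_channel`, the choice of which input drives which attention branch, and the final weight normalization in `scale_space_render` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attentive_fuse(f_shallow, f_deep, w_spatial, w_channel):
    """Fuse two (C, H, W) feature maps by an attention-weighted average.

    Hypothetical parameters (assumptions, not the paper's layers):
      w_spatial: (C,)   weights of a 1x1 projection giving one logit per pixel
      w_channel: (C, C) weights of a linear map giving one logit per channel
    """
    # Spatial attention: one value in (0, 1) per pixel, shape (1, H, W).
    spatial_logits = np.tensordot(w_spatial, f_shallow, axes=([0], [0]))
    a_spatial = sigmoid(spatial_logits)[None, :, :]

    # Channel attention: global average pool, then one value per channel.
    pooled = f_deep.mean(axis=(1, 2))
    a_channel = sigmoid(w_channel @ pooled)[:, None, None]

    # Combined attention broadcasts to (C, H, W) and stays in (0, 1),
    # so the output is a convex combination of the two inputs.
    a = a_spatial * a_channel
    return a * f_shallow + (1.0 - a) * f_deep


def scale_space_render(preds, alphas):
    """Late-fuse K per-scale predictions with alpha-compositing weights.

    preds:  (K, C, H, W) predictions, already resized to a common resolution
    alphas: (K, H, W)    per-pixel attention in [0, 1] for each scale
    """
    # Transmittance: how much weight is left after the earlier scales,
    # i.e. the product of (1 - alpha_j) over all j < i.
    ones = np.ones_like(alphas[:1])
    trans = np.cumprod(np.concatenate([ones, 1.0 - alphas[:-1]], axis=0), axis=0)
    weights = alphas * trans

    # Assumption: renormalize so the weights sum to 1 at every pixel.
    weights = weights / np.maximum(weights.sum(axis=0, keepdims=True), 1e-8)
    fused = (weights[:, None] * preds).sum(axis=0)
    return fused, weights


# Usage with random inputs (shapes only; values carry no meaning).
rng = np.random.default_rng(0)
shallow = rng.normal(size=(8, 16, 16))
deep = rng.normal(size=(8, 16, 16))
fused_features = attentive_fuse(
    shallow, deep, rng.normal(size=8), rng.normal(size=(8, 8))
)
```

Because both functions produce per-pixel weights that form a convex combination, the fused output always lies between its inputs, which is the "weighted average" property the abstract describes. Plain concatenation has no such constraint and no input-dependent weighting.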
Results
We show semantic segmentation and boundary prediction results on videos from Cityscapes, BDD100K, and NYUDv2. Each frame is predicted independently, without using temporal context.
Cityscapes
Examples of running AFA-DLA-X-102 on Cityscapes for semantic segmentation.
BDD100K
Examples of running AFA-DLA-X-169 on BDD100K for semantic segmentation.
NYUDv2
Examples of running AFA-DLA-34 on NYUDv2 for boundary detection.
Paper
Yung-Hsu Yang, Thomas E. Huang, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu. Dense Prediction with Attentive Feature Aggregation. arXiv, 2021.
Code

github.com/SysCV/dla-afa
Citation
@misc{yang2021dense,
title={Dense Prediction with Attentive Feature Aggregation},
author={Yung-Hsu Yang and Thomas E. Huang and Samuel Rota Bulò and Peter Kontschieder and Fisher Yu},
year={2021},
eprint={2111.00770},
archivePrefix={arXiv},
primaryClass={cs.CV}
}