A Unified-BEV Network for Joint 3D Object Detection and Map Segmentation in Complex Traffic Scenario

2025-01-7202

03/19/2025

Features
Event
2024 International Conference on Smart Transportation Interdisciplinary Studies
Abstract
Recently, multi-view image-based Bird’s Eye View (BEV) perception for autonomous driving has gained considerable attention due to its cost-effectiveness and its capacity to capture rich semantic information. However, most existing studies focus on improving the performance of a single task and neglect the dense, robust BEV representation, which benefits various downstream tasks such as 3D object detection and semantic map segmentation. These single-task approaches incur extra computational cost because feature extraction and propagation are repeated for each task. To this end, we develop a network that simultaneously performs 3D object detection and map segmentation in a unified BEV representation space from multi-camera perspective view (PV) image inputs. First, a shared network comprising an image feature extractor and a PV-to-BEV transformation generates a unified BEV feature, which serves as the input to the task-specific decoders. Additionally, a temporal encoder and a perspective supervision head are employed to enhance performance on specific tasks. Finally, the task-specific decoders use the unified BEV representation to predict dynamic or static objects and the semantic map surrounding the ego car. Comprehensive experiments on the nuScenes dataset demonstrate that our multi-task framework outperforms existing state-of-the-art approaches on 3D object detection and semantic map construction.
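The data flow described in the abstract (one shared encoder, one unified BEV feature, several task decoders) can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`UnifiedBEVModel`, `extract_image_features`, `pv_to_bev`, the two decoders) is hypothetical, the "features" are plain nested lists standing in for tensors, the PV-to-BEV step is replaced by simple per-cell averaging, and the temporal encoder and perspective supervision head are omitted. It only shows that the BEV feature is computed once and reused by both heads.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Grid = List[List[float]]  # toy stand-in for a feature map (H x W scalars)


def extract_image_features(image: Grid) -> Grid:
    """Hypothetical image feature extractor: here it just halves each value."""
    return [[0.5 * v for v in row] for row in image]


def pv_to_bev(per_camera_features: List[Grid]) -> Grid:
    """Hypothetical PV-to-BEV step: averages the camera feature maps cell by
    cell. A real model would use a geometric or learned view transformation."""
    n = len(per_camera_features)
    h, w = len(per_camera_features[0]), len(per_camera_features[0][0])
    return [[sum(f[i][j] for f in per_camera_features) / n for j in range(w)]
            for i in range(h)]


def detection_decoder(bev: Grid, threshold: float = 1.0) -> List[tuple]:
    """Toy detection head: returns BEV cells whose value exceeds a threshold."""
    return [(i, j) for i, row in enumerate(bev)
            for j, v in enumerate(row) if v > threshold]


def segmentation_decoder(bev: Grid, threshold: float = 0.5) -> List[List[int]]:
    """Toy map segmentation head: binary mask over the BEV grid."""
    return [[1 if v > threshold else 0 for v in row] for row in bev]


@dataclass
class UnifiedBEVModel:
    decoders: Dict[str, Callable[[Grid], object]]
    encoder_calls: int = 0  # counts how often the shared encoder runs

    def encode(self, images: List[Grid]) -> Grid:
        self.encoder_calls += 1
        return pv_to_bev([extract_image_features(img) for img in images])

    def forward(self, images: List[Grid]) -> Dict[str, object]:
        bev = self.encode(images)  # shared BEV feature, computed once
        return {name: dec(bev) for name, dec in self.decoders.items()}


# Two toy "camera images" of size 2 x 2.
cam_front = [[4.0, 0.0], [2.0, 0.0]]
cam_rear = [[4.0, 0.0], [2.0, 2.0]]

model = UnifiedBEVModel(decoders={
    "detection": detection_decoder,
    "segmentation": segmentation_decoder,
})
outputs = model.forward([cam_front, cam_rear])
```

Because both heads read the same `bev` grid, adding a further task in this sketch means adding a decoder entry rather than a second pass through the encoder, which is the computational saving the abstract refers to.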
Details
DOI
https://doi.org/10.4271/2025-01-7202
Pages
9
Citation
Li, M., Song, T., Xu, Y., Zhou, Z. et al., "A Unified-BEV Network for Joint 3D Object Detection and Map Segmentation in Complex Traffic Scenario," SAE Technical Paper 2025-01-7202, 2025, https://doi.org/10.4271/2025-01-7202.
Additional Details
Publisher
Published
Mar 19, 2025
Product Code
2025-01-7202
Content Type
Technical Paper
Language
English