Object detection is one of the core tasks in autonomous driving perception
systems. Most perception algorithms rely on cameras and LiDAR sensors, but
their robustness is insufficient in harsh environments such as heavy rain and
fog. Moreover, the velocity of objects is crucial for identifying their motion
states. The next
generation of 4D millimeter-wave radar retains the traditional radar advantages
of robustness and velocity measurement while also providing height information,
higher resolution, and greater point density, giving it great potential for 3D
object detection. However, existing methods overlook the need for feature
extraction modules tailored to 4D millimeter-wave radar, which can lead to
information loss. In this study, we propose RadarPillarDet, a novel
approach for extracting features from 4D radar to achieve high-quality object
detection. Specifically, our method introduces a dual-stream encoder (DSE)
module, which combines a traditional multilayer perceptron (MLP) branch with an
attention-based branch; the DSE serves as a powerful point feature extractor
that enhances the feature dimension, as sketched below.
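The abstract does not fix an implementation; as a rough illustration, the
following PyTorch sketch shows one way an MLP stream and a per-pillar attention
stream could be combined. All module names, dimensions, and the concatenation
scheme are our assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class DualStreamEncoder(nn.Module):
    """Sketch of a dual-stream point encoder: an MLP stream and a
    per-pillar self-attention stream, concatenated to lift the
    point feature dimension."""

    def __init__(self, in_dim: int = 10, out_dim: int = 64, num_heads: int = 4):
        super().__init__()
        half = out_dim // 2
        self.mlp = nn.Sequential(nn.Linear(in_dim, half), nn.ReLU())
        self.proj = nn.Linear(in_dim, half)  # lift raw points for the attention stream
        self.attn = nn.MultiheadAttention(half, num_heads, batch_first=True)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (P, N, in_dim) -- P pillars, N points per pillar
        mlp_feat = self.mlp(points)                      # (P, N, half)
        x = self.proj(points)
        attn_feat, _ = self.attn(x, x, x)                # attention over points within a pillar
        return torch.cat([mlp_feat, attn_feat], dim=-1)  # (P, N, out_dim)
```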
Compared with existing pillar encodings, the Sum-Avg-Max Pillar Encoding
(SAMPE) module effectively enriches the features of sparse radar point clouds
by collecting complementary pillar features from three different pooling
encoders, as the following sketch illustrates.
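A minimal sketch of sum/avg/max pillar pooling, assuming points are
zero-padded per pillar with a validity mask; the fusion layer and all
dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SumAvgMaxPillarEncoding(nn.Module):
    """Sketch of SAMPE-style pooling: summarize each pillar's point
    features with sum, average, and max pooling, then fuse the three
    descriptors into one pillar feature."""

    def __init__(self, in_dim: int = 64, out_dim: int = 64):
        super().__init__()
        self.fuse = nn.Linear(3 * in_dim, out_dim)

    def forward(self, point_feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # point_feat: (P, N, C); mask: (P, N) bool, True for real (non-padded) points
        # Assumes every pillar holds at least one real point.
        m = mask.unsqueeze(-1).float()
        s = (point_feat * m).sum(dim=1)                  # sum pooling
        a = s / m.sum(dim=1).clamp(min=1.0)              # average pooling
        masked = point_feat.masked_fill(~mask.unsqueeze(-1), float("-inf"))
        mx = masked.max(dim=1).values                    # max pooling
        return self.fuse(torch.cat([s, a, mx], dim=-1))  # (P, out_dim)
```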
Additionally, to address the issue of noise points in 4D radar, the designed
multi-pillar self-attention (MPSA) module adaptively learns the weights of the
different pillar features, thereby enhancing the quality of the 4D radar bird's
eye view (BEV) features; a sketch follows.
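The abstract leaves open exactly which pillar features MPSA attends over; one
plausible reading, sketched below, treats the K pooled descriptors of each
pillar (e.g. the sum/avg/max features) as tokens and weights them with
self-attention before they are scattered onto the BEV grid. Names and shapes
are hypothetical.

```python
import torch
import torch.nn as nn

class MultiPillarSelfAttention(nn.Module):
    """Sketch of MPSA-style weighting: self-attention over the K pillar
    feature variants lets encodings dominated by noise points contribute
    less to the fused BEV pillar feature."""

    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, pillar_feats: torch.Tensor) -> torch.Tensor:
        # pillar_feats: (P, K, C) -- K feature variants per pillar as tokens
        attn_out, _ = self.attn(pillar_feats, pillar_feats, pillar_feats)
        fused = self.norm(pillar_feats + attn_out)  # residual connection + norm
        return fused.mean(dim=1)                    # (P, C) fused pillar feature
```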
Experimental results on the View of Delft (VoD) dataset show that the proposed
RadarPillarDet achieves excellent detection performance, exceeding the baseline
by 3.22% mAP.