This paper targets the low detection accuracy of 3D object detection in autonomous driving and proposes an improved PointPillars framework that enhances feature representation while reducing computational cost. Accurate perception of surrounding vehicles, pedestrians, and obstacles is critical to the safety and reliability of autonomous driving systems, yet the widely used PointPillars model is constrained by limited global feature extraction and by vulnerability to environmental interference, which restricts its effectiveness in complex real-world scenarios. To address these limitations, the backbone network is reconstructed around a lightweight MobileViTv2 module, as sketched below, strengthening global feature capture and robustness by modeling long-range dependencies without a significant increase in model complexity.
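For concreteness, the following is a minimal PyTorch sketch of the separable self-attention that gives MobileViTv2 its linear cost in the number of tokens; the channel width, token layout, and shape handling are illustrative assumptions, not the exact configuration used in this work.

```python
# Minimal sketch (not the authors' code) of MobileViTv2-style
# separable self-attention: a single softmax-weighted context
# vector replaces the quadratic token-to-token attention map.
import torch
import torch.nn as nn

class SeparableSelfAttention(nn.Module):
    """Linear-complexity attention over N tokens of width d."""
    def __init__(self, d: int):
        super().__init__()
        self.to_scores = nn.Conv1d(d, 1, kernel_size=1)  # token -> scalar score
        self.to_key = nn.Conv1d(d, d, kernel_size=1)
        self.to_value = nn.Conv1d(d, d, kernel_size=1)
        self.out_proj = nn.Conv1d(d, d, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, d, N) -- e.g. a pillar pseudo-image flattened into N tokens
        scores = torch.softmax(self.to_scores(x), dim=-1)           # (B, 1, N)
        context = (scores * self.to_key(x)).sum(-1, keepdim=True)   # (B, d, 1)
        value = torch.relu(self.to_value(x))                        # (B, d, N)
        return self.out_proj(context * value)                       # (B, d, N)

# Usage on a BEV feature map (hypothetical shapes):
feat = torch.randn(2, 64, 248, 216)                  # (B, C, H, W)
out = SeparableSelfAttention(64)(feat.flatten(2)).view_as(feat)
```

Because the pillar pseudo-image contains tens of thousands of BEV cells, this linear-cost context vector is what keeps global attention affordable inside the backbone.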
In addition, a dynamic upsampling strategy replaces the original upsampling module, improving detection performance while reducing both the parameter count and the computational burden; a sketch of one such upsampler follows.
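The following is a minimal sketch of one such dynamic upsampler, patterned after point-sampling designs such as DySample; the 1×1 offset predictor, offset scaling, and channel counts are assumptions for illustration rather than the exact module used here.

```python
# Minimal sketch (assumed design, not the authors' code) of a
# dynamic upsampler: a 1x1 conv predicts per-output-pixel sampling
# offsets, and grid_sample resamples the input at those positions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicUpsample(nn.Module):
    """Upsample by `scale` via content-aware sampling offsets."""
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.scale = scale
        # Two offset channels (dx, dy) per output sub-pixel position.
        self.offset = nn.Conv2d(channels, 2 * scale * scale, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        s = self.scale
        # Predicted offsets rearranged to the output resolution, kept small.
        off = F.pixel_shuffle(self.offset(x), s).permute(0, 2, 3, 1) * 0.25
        # Base sampling grid in [-1, 1] at the output resolution.
        ys = torch.linspace(-1, 1, h * s, device=x.device)
        xs = torch.linspace(-1, 1, w * s, device=x.device)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        grid = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        return F.grid_sample(x, grid + off, align_corners=True)

# Usage (hypothetical shapes): double the BEV map resolution.
up = DynamicUpsample(channels=128, scale=2)
y = up(torch.randn(2, 128, 124, 108))                # -> (2, 128, 248, 216)
```

An offset predictor of this kind carries far fewer parameters than the transposed convolutions it replaces, which is consistent with the parameter reduction reported below.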
The proposed method is validated on a hybrid dataset that combines the public KITTI benchmark with self-collected driving data, providing a more comprehensive evaluation under diverse conditions. Compared with the original PointPillars, experimental results show consistent improvements: a 3.74% increase in Average Orientation Similarity (AOS, defined below), a 1.02% gain in bird's-eye-view (BEV) detection accuracy, and a 2.41% reduction in parameter count, all while maintaining real-time inference at 15.2 frames per second.
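For reference, AOS is the standard KITTI orientation metric; with the benchmark's original 11-point recall interpolation it reads

\[
\mathrm{AOS} = \frac{1}{11} \sum_{r \in \{0,\, 0.1,\, \ldots,\, 1\}} \max_{\tilde{r} \ge r}\, s(\tilde{r}),
\qquad
s(r) = \frac{1}{|\mathcal{D}(r)|} \sum_{i \in \mathcal{D}(r)} \frac{1 + \cos \Delta_{\theta}^{(i)}}{2}\, \delta_i,
\]

where \(\mathcal{D}(r)\) is the set of detections at recall \(r\), \(\Delta_{\theta}^{(i)}\) is the angular error of detection \(i\), and \(\delta_i\) equals 1 if detection \(i\) is matched to a ground-truth box and 0 otherwise.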
Furthermore, qualitative comparisons show that the improved model produces markedly fewer false detections and exhibits greater robustness to interference. Overall, the proposed approach contributes a
practical and efficient 3D object detection framework that achieves higher
accuracy and reliability while meeting the real-time requirements for deployment
in autonomous driving applications.