Integrating 3D point cloud and image fusion into flying car detection systems is
essential for enhancing both safety and operational efficiency. Accurate
environmental mapping and obstacle detection enable flying cars to optimize
flight paths, mitigate collision risks, and perform effectively in diverse and
challenging conditions. The AutoAlignV2 paradigm recently introduced a learnable
schema that unifies these data formats for 3D object detection. However, the
computational expense of the dynamic attention alignment mechanism poses a
significant challenge. To address this, we propose a Lightweight Cross-modal
Feature Dynamic Aggregation Module, which utilizes a model-driven feature
alignment strategy. This module dynamically realigns heterogeneous features and
selectively emphasizes salient aspects of both the point cloud and image
data, sharpening the distinction between objects and background and
improving detection accuracy. Additionally, we introduce the Lightweight
Spatial-Reduction Attention (LSRA) layer to enhance the original attention
mechanism. By employing spatial reduction and positional offset techniques, LSRA
lowers the computational complexity of attention and thereby accelerates the
aggregation of cross-modal features. Furthermore, we implement a
novel dropout scheme before extracting features from 2D images, enhancing the
model's generalization capabilities and reducing computational costs. We present
a new lightweight framework, Lightweight Dynamic Feature Aggregation for
Multi-modal Fusion (LDFA), designed specifically for fusing 3D point cloud data
with 2D image-derived information. The LDFA framework balances computational
efficiency against enhanced perceptual capabilities. Extensive experimental
evaluations on the nuScenes benchmark
dataset confirm the efficacy and efficiency of the LDFA fusion strategy,
demonstrating its potential to advance the state of the art in multimodal 3D
object detection. Code will be available at
https://github.com/zishenjiucai/LDFA.
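
To illustrate why spatial reduction lowers the cost of cross-modal attention, the following is a minimal sketch and not the LSRA layer itself. It assumes that reduction is done by average-pooling consecutive key/value tokens with a ratio `ratio`; the actual reduction operator and the positional offset component are not specified above and are omitted here. All function names (`avg_pool_tokens`, `spatial_reduction_attention`, `score_count`) are hypothetical and chosen only for this example.

```python
import math

def avg_pool_tokens(tokens, ratio):
    """Average each consecutive group of `ratio` tokens into one token.
    Assumption: pooling stands in for the unspecified reduction operator."""
    pooled = []
    for start in range(0, len(tokens), ratio):
        group = tokens[start:start + ratio]
        dim = len(group[0])
        pooled.append([sum(t[i] for t in group) / len(group) for i in range(dim)])
    return pooled

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_reduction_attention(queries, keys, values, ratio):
    """Single-head scaled dot-product attention in which keys and values
    are pooled by `ratio` before the scores are computed."""
    keys_r = avg_pool_tokens(keys, ratio)
    values_r = avg_pool_tokens(values, ratio)
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim_v = len(values_r[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_r]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values_r))
                    for i in range(dim_v)])
    return out

def score_count(num_queries, num_keys, ratio):
    """Number of query-key dot products after reduction."""
    return num_queries * math.ceil(num_keys / ratio)

# Toy example: 2 point-cloud queries attend over 8 image tokens.
image_tokens = [[float(i), 1.0] for i in range(8)]
point_queries = [[0.5, 0.5], [1.0, 0.0]]
fused = spatial_reduction_attention(point_queries, image_tokens, image_tokens, ratio=4)
print(len(fused), len(fused[0]))                   # 2 2
print(score_count(2, 8, 1), score_count(2, 8, 4))  # 16 4
```

In this toy setting the number of query-key products drops from 16 to 4 when `ratio` is 4, which is the kind of saving spatial reduction is meant to provide; the trade-off is that each query attends to coarser image context.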