CMM: LiDAR-Visual Fusion with Cross-Modality Module for Large-Scale Place Recognition
Technical Paper
2023-01-7039
ISSN: 0148-7191, e-ISSN: 2688-3627
Language:
English
Abstract
LiDAR-camera fusion has emerged as a promising approach for improving place
recognition in robotics and autonomous vehicles. However, most existing
approaches treat the sensors separately, overlooking the potential benefits of
the correlation between them. In this paper, we propose a Cross-Modality Module
(CMM) that leverages the correlation between LiDAR and camera features for
place recognition. In addition, to fully exploit the potential of each
modality, we propose a Local-Global Fusion Module that supplements global
coarse-grained features with local fine-grained features. Experimental results
on public datasets demonstrate that our approach improves the average recall by
2.3%, reaching 98.7%, compared with simply stacking LiDAR and camera features.
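To make the comparison concrete, the following is a minimal sketch of a stacking baseline and a recall measurement. It is not the authors' implementation. The abstract does not say how the descriptors are stacked or how average recall is computed, so the sketch assumes that stacking means concatenating the two global descriptors and L2-normalizing the result, and that recall is the fraction of queries with at least one true match among the N nearest database descriptors. The names `stack_descriptors`, `recall_at_n`, and `is_match`, as well as the toy vectors, are hypothetical.

```python
import math


def stack_descriptors(lidar_desc, camera_desc):
    """Assumed baseline fusion: concatenate both descriptors, then L2-normalize."""
    fused = list(lidar_desc) + list(camera_desc)
    norm = math.sqrt(sum(x * x for x in fused)) or 1.0  # avoid division by zero
    return [x / norm for x in fused]


def recall_at_n(query_descs, db_descs, is_match, n=1):
    """Fraction of queries with a true match among the n nearest database entries.

    is_match(query_index, db_index) is a hypothetical ground-truth predicate,
    e.g. "the two places lie within some distance threshold of each other".
    """
    hits = 0
    for qi, q in enumerate(query_descs):
        # Rank database entries by Euclidean distance to the query descriptor.
        ranked = sorted(range(len(db_descs)),
                        key=lambda di: math.dist(q, db_descs[di]))
        if any(is_match(qi, di) for di in ranked[:n]):
            hits += 1
    return hits / len(query_descs)


# Toy data (invented for illustration): two places, 2-D descriptors per sensor.
db = [stack_descriptors([1.0, 0.0], [0.0, 1.0]),
      stack_descriptors([0.0, 1.0], [1.0, 0.0])]
queries = [stack_descriptors([0.9, 0.1], [0.1, 0.9]),
           stack_descriptors([0.1, 0.9], [0.9, 0.1])]


def same_place(qi, di):
    """Toy ground truth: query i was recorded at database place i."""
    return qi == di


toy_recall = recall_at_n(queries, db, same_place, n=1)
```

A learned fusion module such as the CMM would replace `stack_descriptors` in this pipeline. The evaluation function would stay the same, which is what makes the two fusion strategies directly comparable.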
Citation
Xue, S., Li, B., Lu, F., Liu, Z. et al., "CMM: LiDAR-Visual Fusion with Cross-Modality Module for Large-Scale Place Recognition," SAE Technical Paper 2023-01-7039, 2023.