Vehicle Detection Based on Deep Neural Network Combined with Radar Attention Mechanism
Technical Paper
2020-01-5171
ISSN: 0148-7191, e-ISSN: 2688-3627
Event: Automotive Technical Papers
Language: English
Abstract
In the autonomous driving perception task, the accuracy of target detection is an essential evaluation metric, especially for small targets. In this work, we propose a multi-sensor fusion neural network that combines radar and image data to improve the camera's confidence when detecting targets and the accuracy of prediction box regression. The fusion network is based on the basic structure of the single shot multibox detector (SSD). Inspired by attention mechanisms in image processing, our work incorporates a priori knowledge from radar detections into the convolutional block attention module (CBAM), forming a new attention module called the radar convolutional block attention module (RCBAM). We add the RCBAM to the SSD target detection network to build a deep neural network that fuses millimeter-wave radar and camera data. Both the traditional SSD network and the multi-sensor fusion network in this paper are trained on the recently introduced nuTonomy scenes (nuScenes) dataset, which contains both radar and image data. The test results demonstrate that our algorithm improves the mean average precision (mAP) from 38.3% to 43.7%, and results obtained on both the dataset and realistic scenarios show that the RCBAM-based multi-sensor fusion network performs better at detecting small targets.
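The page carries no code, but the abstract's core idea can be illustrated. Below is a minimal PyTorch sketch of an RCBAM-style block, assuming the radar detections are rasterized into a single-channel map aligned with the image feature map and concatenated into CBAM's spatial-attention branch. The class names (`RadarSpatialAttention`, `RCBAM`) and the fusion-by-concatenation choice are assumptions for illustration, not the paper's published design.

```python
# Hedged sketch of an RCBAM-style block: CBAM (Woo et al., 2018) whose
# spatial attention also sees a rasterized radar prior. The paper's exact
# fusion scheme is not reproduced here; this assumes radar detections are
# projected into the image plane as a one-channel map.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Standard CBAM channel attention: shared MLP over avg- and max-pooled features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        return torch.sigmoid(avg + mx).view(b, c, 1, 1)


class RadarSpatialAttention(nn.Module):
    """Spatial attention over [avg-pool, max-pool, radar-prior] maps (assumed fusion)."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x, radar_map):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        # radar_map: (B, 1, H, W) occupancy map resized to the feature size
        return torch.sigmoid(self.conv(torch.cat([avg, mx, radar_map], dim=1)))


class RCBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = RadarSpatialAttention(kernel_size)

    def forward(self, x, radar_map):
        x = x * self.ca(x)             # channel refinement, as in CBAM
        x = x * self.sa(x, radar_map)  # spatial refinement with radar prior
        return x


if __name__ == "__main__":
    feat = torch.randn(2, 256, 38, 38)   # e.g., an SSD300 conv4_3-scale feature map
    radar = torch.rand(2, 1, 38, 38)     # hypothetical rasterized radar prior
    print(RCBAM(256)(feat, radar).shape)  # torch.Size([2, 256, 38, 38])
```

In this sketch the block can be dropped in after any SSD feature map used for prediction, letting radar returns bias the spatial attention toward regions likely to contain small or distant vehicles.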
Citation
Bai, J., Zhang, Y., Huang, L., and Li, S., "Vehicle Detection Based on Deep Neural Network Combined with Radar Attention Mechanism," SAE Technical Paper 2020-01-5171, 2020, https://doi.org/10.4271/2020-01-5171.
Data Sets - Support Documents
Title | Description
---|---
Unnamed Dataset 1 |
Unnamed Dataset 2 |
References
- Girshick, R., Donahue, J., Darrell, T., and Malik, J., "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, 2014, https://doi.org/10.1109/CVPR.2014.81.
- Girshick, R., "Fast R-CNN," 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015, https://doi.org/10.1109/ICCV.2015.169.
- Ren, S., He, K., Girshick, R., and Sun, J., "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis & Machine Intelligence 39(6):1137-1149, 2017, https://doi.org/10.1109/TPAMI.2016.2577031.
- Redmon, J., Divvala, S., Girshick, R., and Farhadi, A., "You Only Look Once: Unified, Real-Time Object Detection," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, https://doi.org/10.1109/CVPR.2016.91.
- Liu, W., Anguelov, D., Erhan, D., Szegedy, C. et al., "SSD: Single Shot MultiBox Detector," European Conference on Computer Vision, Springer, Cham, 2016, 21-37, https://doi.org/10.1007/978-3-319-46448-0_2.
- Redmon, J. and Farhadi, A., "YOLO9000: Better, Faster, Stronger," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, 6517-6525, https://doi.org/10.1109/CVPR.2017.690.
- Redmon, J. and Farhadi, A., "YOLOv3: An Incremental Improvement," arXiv preprint arXiv:1804.02767, 2018.
- Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M., "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv preprint arXiv:2004.10934, 2020.
- Simonyan, K. and Zisserman, A., "Very Deep Convolutional Networks for Large-Scale Image Recognition," Proc. Int. Conf. Learn. Representations, 2015.
- Borji, A. and Itti, L., "State-of-the-Art in Visual Attention Modeling," IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1):185-207, 2013, https://doi.org/10.1109/TPAMI.2012.89.
- Chikkerur, S., Serre, T., and Tan, C., "What and Where: A Bayesian Inference Theory of Attention," Vision Research 50(22):2233-2247, 2010, https://doi.org/10.1016/j.visres.2010.05.013.
- Woo, S., Park, J., Lee, J.Y. et al., "CBAM: Convolutional Block Attention Module," European Conference on Computer Vision, 2018.
- Caesar, H., Bankiti, V., Lang, A.H., Vora, S. et al., "nuScenes: A Multimodal Dataset for Autonomous Driving," arXiv preprint arXiv:1903.11027, 2019.