A Bootstrap Approach to Training DNNs for the Automotive Theater
ISSN: 0148-7191, e-ISSN: 2688-3627
Published March 28, 2017 by SAE International in the United States
The proposed technique is a tailored deep neural network (DNN) training approach that uses an iterative process to support the learning of DNNs by targeting their specific misclassifications and missed detections. The process begins with a DNN trained on freely available annotated image data, which we will refer to as the Base model, where a subset of the classifier's categories is related to the automotive theater. A small set of video capture files taken from drives with test vehicles is selected (based on the diversity of scenes, frequency of vehicles, incidental lighting, etc.), and the Base model is used to detect and classify objects within the video files. A software application developed specifically for this work then allows for the capture of frames from the video set where the DNN has made misclassifications. The corresponding annotation files for these images are subsequently corrected to eliminate mislabels. The corrected annotations and corresponding images are then collated and used to re-train the base DNN model. The process is then repeated: the newly trained model (which we will refer to as the Cycle 1 model) is used to review the same subset of video files. The process is repeated until a satisfactory level of accuracy is achieved, with each cycle of training producing an incrementally improved model. The ultimate objective is to create a robust DNN capable of correctly detecting and classifying a large range of objects pertinent to the general driving experience.
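As a rough illustration of the cycle described above (train Base model, review misclassified frames, correct annotations, retrain), the following Python sketch outlines the control flow. Every callable passed in (train_detector, detect_frames, review_and_correct, evaluate) is a hypothetical placeholder standing in for the paper's actual tooling, not an API from this work; only the loop structure reflects the abstract.

```python
# A minimal sketch of the bootstrap training cycle, assuming hypothetical
# helpers for training, detection, human review, and evaluation.

def bootstrap_train(train_detector, detect_frames, review_and_correct,
                    evaluate, base_annotations, video_files,
                    target_accuracy=0.95, max_cycles=10):
    """Iteratively retrain a detector on its own corrected mistakes."""
    training_set = list(base_annotations)
    model = train_detector(training_set)        # the Base model

    for cycle in range(1, max_cycles + 1):
        for video in video_files:
            # Run the current model over the drive video and capture
            # frames where it misclassified or missed an object.
            detections = detect_frames(model, video)

            # A human reviewer corrects the annotation files for the
            # captured frames, eliminating mislabels.
            corrected = review_and_correct(detections)
            training_set.extend(corrected)

        # Retrain on the augmented set; this yields the "Cycle N" model.
        model = train_detector(training_set)

        # Stop once a satisfactory level of accuracy is reached.
        if evaluate(model) >= target_accuracy:
            break

    return model
```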
Citation: Solomon, J. and Charette, F., "A Bootstrap Approach to Training DNNs for the Automotive Theater," SAE Technical Paper 2017-01-0099, 2017, https://doi.org/10.4271/2017-01-0099.