Generation and Usage of Virtual Data for the Development of Perception Algorithms Using Vision
ISSN: 0148-7191, e-ISSN: 2688-3627
Published April 05, 2016 by SAE International in United States
Camera data generated in a 3D virtual environment has been used to train object detection and identification algorithms. Forty common US road traffic signs served as the objects of interest. In a virtual driving environment, the camera was first placed at random positions along the road, at a height appropriate for a camera mounted near a vehicle's rear-view mirror; traffic signs were then placed randomly alongside the road in front of it. To better represent the real world, effects such as shadows, occlusions, washout/fade, skew, rotation, reflections, fog, rain, snow, and varied illumination were randomly applied to the generated data. Images were generated at a rate of approximately one thousand per minute, and each image was automatically annotated with the true location of every sign it contained, facilitating both supervised learning and testing of the trained algorithms. A deep convolutional neural network was built with 8 hidden layers, 1.5 million free parameters, and 250,000 neurons, in a configuration tuned for traffic sign classification, and was trained on this dataset. It achieved a high cross-validation accuracy of 98% with stable k-fold validation energy. The network, trained entirely on virtual images, was then tested on real-world images with promising results: it consistently classified signs that appeared much smaller and farther away than those in its training images. When presented with signs it had not been trained on, the network predictably assigned the most similar known label.
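The key idea above is that a synthetic generator can emit an image and its ground-truth annotation in the same step, so no manual labeling is needed. The following minimal Python/NumPy sketch illustrates that pattern under stated assumptions: the paper's pipeline renders scenes in a full 3D engine, whereas this toy version merely composites a stand-in sign patch onto a flat background with one randomized effect (washout/fade); the function name, label string, and dictionary layout are hypothetical, not the authors' API.

```python
import random
import numpy as np

def generate_annotated_image(sign, img_h=256, img_w=256, rng=None):
    """Place a sign patch at a random position on a background image and
    return the image together with its ground-truth bounding box.
    Hypothetical sketch; the paper's generator runs in a 3D engine."""
    rng = rng or random.Random(0)
    # Flat grey background stands in for a rendered road scene.
    img = np.full((img_h, img_w), 128, dtype=np.uint8)
    sh, sw = sign.shape
    # Random placement, keeping the sign fully inside the frame.
    y = rng.randrange(0, img_h - sh)
    x = rng.randrange(0, img_w - sw)
    # Random washout/fade: blend the sign toward the background,
    # one of the paper's listed augmentation effects.
    fade = rng.uniform(0.5, 1.0)
    region = img[y:y + sh, x:x + sw].astype(np.float32)
    patch = fade * sign.astype(np.float32) + (1.0 - fade) * region
    img[y:y + sh, x:x + sw] = patch.astype(np.uint8)
    # Automatic annotation: the true box is known by construction,
    # which is what enables supervised training without hand labels.
    annotation = {"label": "stop", "bbox": (x, y, sw, sh)}
    return img, annotation

# Stand-in for a rendered 32x32 sign texture.
sign = np.full((32, 32), 255, dtype=np.uint8)
img, ann = generate_annotated_image(sign)
```

Because the generator, not a human, decides where each sign goes, the annotation is exact by construction; the paper exploits exactly this property to produce roughly a thousand labeled images per minute.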
Citation: Nariyambut Murali, V., Micks, A., Goh, M., and Liu, D., "Generation and Usage of Virtual Data for the Development of Perception Algorithms Using Vision," SAE Technical Paper 2016-01-0170, 2016, https://doi.org/10.4271/2016-01-0170.