A Framework for Robust Driver Gaze Classification
- Technical Paper
- ISSN 0148-7191
- DOI: https://doi.org/10.4271/2016-01-1426
Published April 5, 2016 by SAE International in United States
The challenge of developing a robust, real-time driver gaze classification system is that it has to handle difficult edge cases that arise in real-world driving conditions: extreme lighting variations, eyeglass reflections, sunglasses, and other occlusions. We propose a single-camera end-to-end framework for classifying driver gaze into a discrete set of regions. This framework includes data collection, semi-automated annotation, offline classifier training, and an online real-time image processing pipeline that classifies the gaze region of the driver. We evaluate an implementation of each component on various subsets of a large on-road dataset. The key insight of our work is that robust driver gaze classification in real-world conditions is best approached by leveraging the power of supervised learning to generalize over the edge cases present in large annotated on-road datasets.
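To make the split between offline classifier training and online per-frame classification concrete, the following is a minimal sketch, not the authors' implementation. It assumes each video frame has already been reduced to a small numeric feature vector (here a hypothetical head yaw and pitch in degrees), and it swaps in a simple nearest-centroid classifier as a stand-in for whatever supervised model the paper trains. The region names, feature values, and function names are all illustrative assumptions.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """Offline step: average the feature vectors of each annotated gaze region."""
    grouped = defaultdict(list)
    for features, region in samples:
        grouped[region].append(features)
    return {
        region: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for region, vectors in grouped.items()
    }


def classify_gaze(centroids, features):
    """Online step: return the region whose centroid is nearest to this frame."""
    return min(centroids, key=lambda region: dist(centroids[region], features))


# Hypothetical annotated frames: (yaw, pitch) in degrees paired with a region label.
annotated_frames = [
    ((0.0, 0.0), "road"),
    ((2.0, -1.0), "road"),
    ((30.0, -20.0), "center_stack"),
    ((28.0, -18.0), "center_stack"),
    ((-40.0, 0.0), "left"),
    ((-38.0, 2.0), "left"),
]

centroids = train_centroids(annotated_frames)
predicted_region = classify_gaze(centroids, (1.0, 0.5))
```

In a real system the feature vector would come from the image processing pipeline and the training set would be the large annotated on-road dataset; the point of the sketch is only that the classifier is fit once offline and then applied to each incoming frame.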
Citation: Fridman, L., Lee, J., Reimer, B., and Mehler, B., "A Framework for Robust Driver Gaze Classification," SAE Technical Paper 2016-01-1426, 2016, https://doi.org/10.4271/2016-01-1426.