Visual Speech Interface: Apparatus and Algorithms
Technical Paper
1999-01-5510
ISSN: 0148-7191, e-ISSN: 2688-3627
Language:
English
Abstract
To make speech recognition a viable input modality in the cockpit, we propose adding visual speech input to improve the robustness of the approach in the presence of noise. The visual speech interface comprises a head-mounted lip imaging apparatus and algorithms that recognize spoken words visually. The algorithms are built from a small number of components that together address lip localization, lip shape model extraction, tracking, feature extraction, and recognition. We demonstrate the practicability of the concept with a visual speech recognizer for a discrete-word recognition task that is relatively simple but achievable in real time.
Citation
Chan, M., "Visual Speech Interface: Apparatus and Algorithms," SAE Technical Paper 1999-01-5510, 1999, https://doi.org/10.4271/1999-01-5510.