Evaluation of Voice Biometrics for Identification and Authentication

Nikhitha Bekkanti; Leah Busch; Scott Amman

doi:10.4271/2021-01-0262

The work presented here is part of ongoing research in the field of voice biometrics. This paper aims to clarify the state of the art in speaker recognition technology as it applies to two problems: speaker identification (identifying a speaker among multiple speakers) and speaker verification/authentication (recognizing the current speaker at a pre-defined access level and authenticating accordingly). The research focused on performing an unbiased evaluation of two individual voice biometric services. The accuracy with which these services identify and authenticate individuals provides insight into the current state of the technology and into which additional authentication methods could be paired with voice, in a dual authentication scheme, to achieve a desired True Acceptance Rate (TAR) and False Acceptance Rate (FAR).
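For readers unfamiliar with the two metrics, the following is a minimal sketch of how TAR and FAR are commonly computed from verification trials. It assumes each trial yields a similarity score and that a trial is accepted when the score meets a threshold. The function name, the score values, and the threshold are hypothetical and are not taken from the evaluated services.

```python
from typing import Sequence, Tuple


def acceptance_rates(
    genuine_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (TAR, FAR) for a given decision threshold.

    genuine_scores: scores from trials where the speaker is who they claim to be.
    impostor_scores: scores from trials where the speaker is someone else.
    A trial is accepted when its score is >= threshold (an assumption).
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    # TAR: fraction of genuine trials that were accepted.
    tar = sum(s >= threshold for s in genuine_scores) / len(genuine_scores)
    # FAR: fraction of impostor trials that were (wrongly) accepted.
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return tar, far


# Hypothetical scores, for illustration only.
tar, far = acceptance_rates([0.9, 0.8, 0.4, 0.7], [0.2, 0.6, 0.1, 0.3], threshold=0.5)
print(f"TAR={tar:.2f}, FAR={far:.2f}")
```

Raising the threshold generally lowers FAR at the cost of a lower TAR, which is why the two rates are reported together.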

Several factors were considered in order to evaluate the technology for interior/exterior use cases: complexity, ease of use for enrollment, effect of background noise, distance from the microphone, and length of the authentication speech. A generic strategy was designed to evaluate both services under the same test conditions. The false acceptance rates obtained led to further study of the need for a dual authentication system for business-critical use cases (e.g., payment transactions, authorized entry into a vehicle) and non-critical use cases (e.g., personalized audio/seat settings, suggestions, general queries).
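The motivation for dual authentication can be illustrated with a short sketch. It assumes that the two factors make statistically independent errors and that access is granted only when both factors accept (AND fusion). Under those assumptions the combined rates are the products of the individual rates. The function name and all numeric rates below are hypothetical and are not results from this study.

```python
from typing import Tuple


def and_fusion(tar_a: float, far_a: float, tar_b: float, far_b: float) -> Tuple[float, float]:
    """Combined (TAR, FAR) when both factors must accept.

    Assumes the two factors err independently, which may not hold in practice.
    """
    combined_tar = tar_a * tar_b  # a genuine user must pass both factors
    combined_far = far_a * far_b  # an impostor must fool both factors
    return combined_tar, combined_far


# Hypothetical rates: voice (TAR 0.95, FAR 0.05) paired with a second factor (TAR 0.98, FAR 0.01).
tar, far = and_fusion(0.95, 0.05, 0.98, 0.01)
print(f"combined TAR={tar:.4f}, combined FAR={far:.6f}")
```

Under these assumptions, requiring a second factor lowers FAR much more than it lowers TAR, which is the trade-off that makes pairing attractive for business-critical use cases.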

This research showed that enrollment can be performed on random speech and that a smaller number of enrollees yields better speaker identification. It was also determined that signal-to-noise ratio (SNR) and distance from the microphone have a significant effect on speaker identification. For business-critical use cases, it was concluded that voice biometric technology cannot be used as a standalone authentication method; it needs to be paired with other authentication methods, such as facial recognition, passwords, or vein recognition, for secure authentication.
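As a reminder of the quantity involved, SNR is conventionally expressed in decibels as ten times the base-10 logarithm of the ratio of signal power to noise power. The sketch below assumes that separate speech and noise sample sequences are available, which is a simplification. The function names and sample values are hypothetical and do not describe how SNR was measured in this study.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power of a sampled signal (mean of squared amplitudes)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """SNR in dB: 10 * log10(P_speech / P_noise). Noise power must be non-zero."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Hypothetical samples: speech amplitude 1.0, noise amplitude 0.1.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(f"SNR = {snr_db(speech, noise):.1f} dB")
```

A tenfold drop in noise amplitude relative to speech corresponds to a 20 dB gain in SNR.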