Automatic speaker recognition with crosslanguage speech material

Hermann J. Künzel

doi:10.1558/ijsll.v20i1.21

Authors

Hermann J. Künzel, University of Marburg

DOI:

https://doi.org/10.1558/ijsll.v20i1.21

Keywords:

forensic speaker recognition, automatic speaker recognition, cross-language speech material, transmission channel characteristics

Abstract

Automatic systems for forensic speaker recognition (FASR) are claimed to be largely independent of language, because their feature vectors are composed of acoustic parameters derived from the resonance characteristics of the vocal tract cavities. Yet a certain ‘language gap’ may remain, which may degrade the performance of a system unless it is properly compensated for. This forensic aspect of what may be called cross-language speaker recognition has not yet received due attention. Based on the most common forensic cross-language setting, this study aimed to assess the effect of language mismatch on the performance of a standard FASR system and to compare its magnitude with the effect of other sources of mismatch on the same voice data. An experiment using the automatic system Batvox 3, with 75 bilingual speakers of seven languages and four kinds of transmission channels, shows that if the speaker model and the reference population are matched in terms of language, the remaining mismatch between speaker model and test sample can be neglected: equal error rates (EERs) for same-language and cross-language comparisons are approximately the same, ranging from zero to 5.6%. Transmission of the speech data via landline telephone, GSM and, for part of the corpus, VoIP (using Skype) caused EERs to rise by less than 1% on average.
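For readers unfamiliar with the equal error rate used as the performance measure above, the following minimal Python sketch shows one common way to approximate an EER from two sets of comparison scores. It is an illustration only and does not reproduce how Batvox 3 or this study computed EERs. The function name `equal_error_rate`, the convention that higher scores mean greater similarity, and the example scores are all assumptions made for the sketch; the scores are invented and are not data from the study.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER from same-speaker (target) and
    different-speaker (non-target) comparison scores.

    Assumption: a higher score means the two samples are more similar.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above the maximum.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: same-speaker pairs scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: different-speaker pairs scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Invented illustrative scores (not from the study).
separated = equal_error_rate([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])
overlapping = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(separated, overlapping)
```

In the first call the two score sets do not overlap, so a threshold exists at which both error rates are zero. In the second, one same-speaker score falls below one different-speaker score, and the two error rates meet at 25%.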

Author Biography

Hermann J. Künzel, University of Marburg

Hermann J. Künzel is Professor of Phonetics at the University of Marburg, Germany. From 1985 to 1999 he was Head of the Speaker Identification & Tape Authentication Department of the Federal Criminal Police Office (BKA) in Wiesbaden, Germany. Between 1980 and 1990 he was instrumental in developing the acoustic-phonetic method of forensic speaker recognition (FSR). He has worked as a professional expert in FSR, speaker profiling, voice line-ups and non-speech-related acoustic investigations (e.g. aircraft and shipping incidents) for courts and government institutions throughout Germany and worldwide. He has been applying automatic speaker identification to cases of lawful interception and in court since 2001.
