Aural and automatic forensic speaker recognition in mismatched conditions

Authors

  • Anil Alexander Ecole Polytechnique Federale de Lausanne
  • Damien Dessimoz University of Lausanne
  • Filippo Botti University of Lausanne
  • Andrzej Drygajlo Ecole Polytechnique Federale de Lausanne

DOI:

https://doi.org/10.1558/sll.2005.12.2.214

Keywords:

aural speaker recognition, automatic speaker recognition, strength of evidence, mismatched recording conditions

Abstract

In this article, we compare aural and automatic speaker recognition in the context of forensic analyses, using a Bayesian framework for the interpretation of evidence. We use perceptual tests performed by non-experts and compare their performance with that of an automatic speaker recognition system. These experiments were performed with 90 phonetically untrained subjects. Several forensic cases were simulated using the Polyphone IPSC-02 database, varying in linguistic content and technical conditions of recording. We estimate the strength of evidence for both humans and the baseline automatic system, calculating likelihood ratios from perceptual scores for humans and from log-likelihood scores for the automatic system. A methodology analogous to the Bayesian interpretation in forensic automatic speaker recognition is applied to the perceptual scores given by humans in order to estimate the strength of evidence. The degradation in accuracy of human recognition in mismatched recording conditions is contrasted with that of the automatic system under similar conditions. The conditions considered are fixed telephone, cellular telephone and noisy speech in forensically realistic conditions. The perceptual cues that human subjects use to perceive differences in voices are studied, along with their importance in different recording conditions. We observe that while automatic speaker recognition shows higher accuracy in matched training and testing conditions, its performance degrades significantly in mismatched conditions. Aural recognition accuracy also degrades from matched to mismatched conditions; in mismatched conditions, the baseline automatic system showed comparable or slightly worse performance than aural recognition, while the baseline system with adaptation to noisy conditions showed comparable or better performance than aural recognition.
The higher-level perceptual cues used by human listeners to recognise speakers are discussed, as is the possibility of increasing the accuracy of automatic systems using perceptual cues that remain robust to mismatched recording conditions.
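The Bayesian interpretation mentioned in the abstract rests on the likelihood ratio LR = p(E | same speaker) / p(E | different speakers), where E is a comparison score (perceptual or log-likelihood). A minimal sketch of this calculation, assuming the within-source and between-source score distributions are modelled as Gaussians with entirely hypothetical means and standard deviations (the article's actual distributions are estimated from the simulated casework data):

```python
import math

def normal_pdf(x, mean, std):
    """Gaussian probability density function."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical score distributions (illustrative values only):
# H0: scores from same-speaker (within-source) comparisons
# H1: scores from different-speaker (between-source) comparisons
H0_MEAN, H0_STD = 2.0, 0.8
H1_MEAN, H1_STD = -1.0, 1.2

def likelihood_ratio(evidence_score):
    """LR = p(E | H0) / p(E | H1); values > 1 support H0."""
    return (normal_pdf(evidence_score, H0_MEAN, H0_STD)
            / normal_pdf(evidence_score, H1_MEAN, H1_STD))

# A score near the within-source distribution yields LR > 1,
# supporting the same-speaker hypothesis; a score near the
# between-source distribution yields LR < 1.
lr_same = likelihood_ratio(1.5)
lr_diff = likelihood_ratio(-2.0)
```

The same formula applies whether the evidence score comes from a human listener's perceptual rating or from an automatic system's log-likelihood output, which is what allows the two to be compared on a common strength-of-evidence scale.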

Author Biographies

  • Anil Alexander, Ecole Polytechnique Federale de Lausanne
    Anil Alexander received his BTech in computer science and engineering from the Indian Institute of Technology, Madras in 2000. Currently he is pursuing his PhD studies in forensic automatic speaker recognition with the Speech Processing and Biometrics Group (GTPB).
  • Damien Dessimoz, University of Lausanne
    Damien Dessimoz received his MSc in forensic science from the University of Lausanne in 2001 and is currently pursuing his PhD studies in biometrics.
  • Filippo Botti, University of Lausanne
    Filippo Botti received his MSc in forensic science from the University of Lausanne in 2001 and is currently pursuing his PhD studies in forensic audio and speech.
  • Andrzej Drygajlo, Ecole Polytechnique Federale de Lausanne
    Andrzej Drygajlo received his MSc and PhD in electronic engineering from the Silesian Technical University, Poland, in 1974 and 1983, respectively. Since 1990 he has been affiliated with the Ecole Polytechnique Federale de Lausanne and is the author or co-author of more than 90 research publications.

Published

2005-08-14

Section

Articles

How to Cite

Alexander, A., Dessimoz, D., Botti, F., & Drygajlo, A. (2005). Aural and automatic forensic speaker recognition in mismatched conditions. International Journal of Speech, Language and the Law, 12(2), 214-234. https://doi.org/10.1558/sll.2005.12.2.214