Automatic Speaker Recognition as a Measurement of Voice Imitation and Conversion

Authors

  • Mireia Farrús Universitat Politècnica de Catalunya
  • Michael Wagner University of Canberra
  • Daniel Erro Universitat Politècnica de Catalunya
  • Javier Hernando Universitat Politècnica de Catalunya

DOI:

https://doi.org/10.1558/ijsll.v17i1.119

Keywords:

Imitation, voice conversion, prosody, jitter, shimmer, speaker recognition

Abstract

Voices can be deliberately disguised by means of human imitation or voice conversion. The question arises as to what extent they can be modified by either of these two methods. In the current paper, a set of speaker identification experiments is conducted: first, analysing some prosodic features extracted from the voices of professional impersonators attempting to mimic a target voice and, second, using both intragender and crossgender converted voices in a spectral-based speaker recognition system. The results show that the identification error rate increases when testing with imitated voices, as well as when using converted voices, especially the crossgender ones.

Author Biographies

  • Mireia Farrús, Universitat Politècnica de Catalunya
    Degree in Physics at University of Barcelona; Degree in Linguistics at University of Barcelona; PhD (October 2008) at Universitat Politècnica de Catalunya (Technical University of Catalonia), Department of Signal Theory and Communications
  • Michael Wagner, University of Canberra
    PhD; Professor of Computing at School of Information Sciences and Engineering; Director of the National Centre for Biometric Studies, University of Canberra
  • Daniel Erro, Universitat Politècnica de Catalunya
    PhD at Universitat Politècnica de Catalunya, Department of Signal Theory and Communications
  • Javier Hernando, Universitat Politècnica de Catalunya
    PhD; Associate Professor at Universitat Politècnica de Catalunya

Published

2010-06-15

Issue

Section

Articles

How to Cite

Farrús, M., Wagner, M., Erro, D., & Hernando, J. (2010). Automatic Speaker Recognition as a Measurement of Voice Imitation and Conversion. International Journal of Speech, Language and the Law, 17(1), 119-142. https://doi.org/10.1558/ijsll.v17i1.119