Speaker-individuality in Fujisaki model f0 features: implications for forensic voice comparison

Authors

  • Adrian Leemann University of Cambridge
  • Hansjörg Mixdorff Beuth-Hochschule für Technik Berlin FB Informatik und Medien
  • Maria O'Reilly Trinity College Dublin
  • Marie-José Kolly University of Zurich
  • Volker Dellwo University of Zurich

DOI:

https://doi.org/10.1558/ijsll.v21i2.343

Keywords:

speaker individuality, prosody, intonation, speaking style variability

Abstract

Fundamental frequency (f0) is a highly speaker-specific feature. Consequently, practitioners often use f0 information in forensic casework. Current research principally examines the use of long-term f0 statistics, such as f0 means and standard deviations, for forensic voice comparison. The present study investigates how short-term f0 features, such as those measured by the Fujisaki intonation model, capture speaker-individuality. Based on data from a homogeneous group of Zurich German speakers, we conducted an experiment on a large corpus of read speech and on a subset of sentences that included speaking style variability (spontaneous vs. read). The latter is characteristic of forensic casework. Speakers demonstrated high between-speaker variability and low within-speaker variability across the two speaking styles for a number of f0 features. Given this evidence of speaker-individuality, we discuss Fujisaki f0 features’ potential for forensic voice comparison.

Author Biographies

  • Adrian Leemann, University of Cambridge
    Adrian Leemann is a visiting scholar at the Phonetics Laboratory, Department of Theoretical and Applied Linguistics, University of Cambridge. He is currently sponsored by the Swiss National Science Foundation to work on the project ‘The contribution of segmental and suprasegmental cues in the recognition of a speaker’s dialect’. He is the principal investigator in the Swiss National Science Foundation project Voice Äpp (www.voiceapp.ch). Adrian Leemann was previously a postdoctoral researcher at the University of Zurich where he was employed in Volker Dellwo’s project on ‘Forensic phonetic speaker identification based on temporal evidence’ – also sponsored by the Swiss National Science Foundation. His research interests include speaker individuality, prosody, dialect recognition, and forensic phonetics. He is a member of IAFPA.
  • Hansjörg Mixdorff, Beuth-Hochschule für Technik Berlin FB Informatik und Medien
    Hansjörg Mixdorff is a Professor of Digital Audio and Video Processing at the University of Applied Sciences, Beuth-Hochschule für Technik, Berlin. His principal research interests are the modeling of prosodic features of speech, particularly in cross-language comparison, and Text to Speech systems. He applied the Fujisaki intonation model to the comparison of speaking styles and the analysis of foreign accent. More recently he has been working on auditory-visual speech with a focus on non-verbal facial gestures and their alignment with prosodic features. He is a member of ISCA and of the ISCA Special Interest Group on Speech Prosody.
  • Maria O'Reilly, Trinity College Dublin
    Maria O’Reilly is a researcher at the Phonetics and Speech Laboratory, Trinity College Dublin. She recently completed her PhD thesis on the intonation of Connemara and Donegal Irish in which the Fujisaki model of intonation was used in parallel with an Autosegmental-Metrical description and derived f0 parameters. Her teaching covers phonetics theory and experimental phonetics. Her research interests include the prosody of Irish Gaelic, as well as cross-dialect and cross-speaker variation in intonation.
  • Marie-José Kolly, University of Zurich
    Marie-José Kolly is a PhD student at the Phonetics Laboratory, Department of Comparative Linguistics, University of Zurich. She is currently working in the Swiss National Science Foundation Project ‘Forensic phonetic speaker identification based on temporal evidence’ and is writing her PhD thesis on speech temporal features in L2 speech. She is a co-investigator in the Swiss National Science Foundation project ‘Swiss VoiceApp - Your Voice. Your identity.’ Previously, Marie-José Kolly did an internship at LINGUA, the Swiss federal institution responsible for LADO at the office for migration. Her research interests include forensic phonetics, speech rhythm, foreign accent, and first and second language acquisition. She is a member of IAFPA.
  • Volker Dellwo, University of Zurich
    Volker Dellwo is Professor of Phonetics and Phonology at the Department of Comparative Linguistics, University of Zurich. He is currently the principal investigator in the Swiss National Science Foundation Project ‘Forensic phonetic speaker identification based on temporal evidence’ and in the project ‘VoiceTime – Speaker recognition based on temporal information’ funded by the Gebert Rüf Foundation. Volker Dellwo’s research interests range over a wide variety of duration, rhythm, and timing phenomena in speech and he is particularly interested in improving methods for the extraction of rhythmic characteristics from the speech signal. He is a member of IAFPA.

Published

2015-02-18

Issue

Section

Articles

How to Cite

Leemann, A., Mixdorff, H., O'Reilly, M., Kolly, M.-J., & Dellwo, V. (2015). Speaker-individuality in Fujisaki model f0 features: implications for forensic voice comparison. International Journal of Speech, Language and the Law, 21(2), 343-370. https://doi.org/10.1558/ijsll.v21i2.343