Strength of forensic voice comparison evidence from the acoustics of filled pauses


  • Vincent Hughes, University of York
  • Sophie Wood, University of York
  • Paul Foulkes, University of York



Keywords: forensic voice comparison, hesitation markers, likelihood ratio, formant dynamics, durations


This study investigates the evidential value of filled pauses (FPs, i.e. um, uh) as variables in forensic voice comparison. FPs for 60 young male speakers of standard southern British English were analysed, drawn from Task 1 of the DyViS corpus (Nolan et al., 2009). The following acoustic properties were analysed: midpoint frequencies of the first three formants in the vocalic portion; ‘dynamic’ characterisations of formant trajectories (i.e. quadratic polynomial equations fitted to nine measurement points over the entire vowel); vowel duration; and nasal duration for um. Likelihood ratio (LR) scores were computed using the Multivariate Kernel Density formula (MVKD; Aitken and Lucy, 2004) and converted to calibrated log10 LRs (LLRs) using logistic regression (Brümmer et al., 2007). System validity was assessed using both equal error rate (EER) and the log LR cost function (Cllr; Brümmer and du Preez, 2006). The best-performing system combines dynamic measurements of all three formants with vowel and nasal duration for um, achieving an EER of 4.08% and a Cllr of 0.12. In terms of general patterns, um consistently outperformed uh. For um, the formant dynamic systems yielded better validity than those based on midpoints, presumably reflecting the additional formant movement in um caused by the transition from vowel to nasal. By contrast, midpoints outperformed dynamics for the more monophthongal uh. Further, the addition of duration (vowel only, or vowel and nasal) consistently improved system performance. The study supports the view that FPs have excellent potential as variables in forensic voice comparison cases.
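
To make the two validity metrics concrete, the following is a minimal sketch of how Cllr and EER can be computed from calibrated log10 LRs. It is not the authors' implementation: the function names cllr and eer, the input lists and the simple threshold sweep used to approximate the EER are all assumptions made for illustration, and the toy values in the usage note are not data from the study.

```python
import math


def cllr(ss_llrs, ds_llrs):
    """Log LR cost from log10 LRs (hypothetical helper, not the authors' code).

    ss_llrs: log10 LRs from same-speaker comparisons.
    ds_llrs: log10 LRs from different-speaker comparisons.
    """
    # Same-speaker trials are penalised for low LRs: log2(1 + 1/LR).
    ss_pen = [math.log2(1.0 + 10.0 ** (-x)) for x in ss_llrs]
    # Different-speaker trials are penalised for high LRs: log2(1 + LR).
    ds_pen = [math.log2(1.0 + 10.0 ** x) for x in ds_llrs]
    return 0.5 * (sum(ss_pen) / len(ss_pen) + sum(ds_pen) / len(ds_pen))


def eer(ss_llrs, ds_llrs):
    """Approximate equal error rate via a threshold sweep (an assumption;
    other interpolation schemes exist and may give slightly different values).
    """
    best_gap, best_rate = None, None
    for t in sorted(list(ss_llrs) + list(ds_llrs)):
        # False rejection: same-speaker score falls below the threshold.
        frr = sum(1 for x in ss_llrs if x < t) / len(ss_llrs)
        # False acceptance: different-speaker score reaches the threshold.
        far = sum(1 for x in ds_llrs if x >= t) / len(ds_llrs)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, 0.5 * (far + frr)
    return best_rate
```

For example, cllr([2.0, 3.0], [-2.0, -3.0]) is close to zero because every comparison supports the correct hypothesis, whereas a system that always outputs an LLR of 0 has a Cllr of 1. Values below 1 indicate that the system provides useful information, so the reported Cllr of 0.12 sits near the well-performing end of that range.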

Author Biographies

Vincent Hughes, University of York

Vincent Hughes is Lecturer in Forensic Speech Science at the University of York. In 2015, he was a post-doctoral research assistant on the project Voice and Identity – Source, Filter, Biometric (funded by the UK Arts and Humanities Research Council #AH/M003396/1, 2015-17). His research interests lie in forensic speech science, phonetics, phonology, sociophonetics and sociolinguistics. He is a member of the International Association of Forensic Phonetics and Acoustics.

Sophie Wood, University of York

Sophie Wood works for the UK Civil Service. She holds undergraduate (BA English Language and Linguistic Science) and postgraduate (MSc Forensic Speech Science) degrees from the University of York. She researched filled pauses as a discriminatory parameter for forensic speaker comparison for her MSc dissertation and presented at the 2014 IAFPA conference. Sophie also worked on the project ‘Perceptual adaptation to regional accents as a new lens on the puzzle of spoken word recognition’.

Paul Foulkes, University of York

Paul Foulkes is Professor in the Department of Language and Linguistic Science, University of York. His teaching and research interests include forensic phonetics, laboratory phonology, phonological development, and sociolinguistics. His current collaborators include Cathi Best, Jean-Pierre Chevrot, Gerry Docherty, Bronwen Evans, Peter French, Bill Haddican, Jen Hay, Vincent Hughes, Jason Shaw, Marilyn Vihman and Kim Wilson. He has worked on over 200 forensic cases from the UK, Ghana and New Zealand.


How to Cite

Hughes, V., Wood, S., & Foulkes, P. (2016). Strength of forensic voice comparison evidence from the acoustics of filled pauses. International Journal of Speech, Language and the Law, 23(1), 99–132.




