Auditory speaker discrimination by forensic phoneticians and naive listeners in voiced and whispered speech

Authors

  • Anna Bartle, Metropolitan Police UK
  • Volker Dellwo, University of Zurich

DOI:

https://doi.org/10.1558/ijsll.v22i2.23101

Keywords:

auditory speaker discrimination, whispered speech, expert listeners

Abstract

In whispered speech some important cues to a speaker’s identity (e.g. fundamental frequency, intonation) are inevitably absent. In the present study we investigated listeners’ ability to discriminate between speakers in short utterances in voiced and whispered speech. The performances of a group of 11 forensic phoneticians and a group of 22 naive listeners were compared in a binary forced-choice speaker discrimination task, with 48 same-speaker and 60 different-speaker pairs of short speech samples (? 3 s) in each test. Listeners were asked to say whether the two voice samples in each pair were produced by the same or different speakers, and to give a certainty rating. The results showed that speaker discrimination is more difficult in whispered than in voiced speech, and that while the phoneticians were only slightly better than the naive listeners in voiced speech, the gap widened in whispered speech. Phoneticians were more cautious in their responses, but also more accurate than naive listeners. When unsure, the phoneticians tended to say two utterances came from different speakers, whereas naive listeners tended to say two utterances came from the same speaker. Results support the view that trained phoneticians may have an advantage over naive listeners in auditory speaker discrimination when the signal is degraded.
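The distinction drawn above between accuracy and response tendency (being "more cautious", or leaning towards "same" or "different" when unsure) can be illustrated with standard signal detection indices. The abstract does not say which measures the study used, so the following is only a sketch under stated assumptions: "same speaker" is treated as the signal response, sensitivity is summarised as d′ and response bias as the criterion c, and the listener counts are hypothetical, chosen only to match the 48 same-speaker and 60 different-speaker pairs per test.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for a same/different task.

    Assumption: a "same speaker" answer is the signal response, so a hit is
    "same" on a same-speaker pair and a false alarm is "same" on a
    different-speaker pair.
    """
    n_same = hits + misses
    n_diff = false_alarms + correct_rejections
    # Log-linear correction keeps both rates strictly between 0 and 1,
    # so the inverse normal transform below is always defined.
    hit_rate = (hits + 0.5) / (n_same + 1)
    fa_rate = (false_alarms + 0.5) / (n_diff + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # > 0: leans towards "different"
    return d_prime, criterion


# Hypothetical listeners (illustrative counts, not data from the study).
leans_different = sdt_indices(hits=30, misses=18, false_alarms=6, correct_rejections=54)
leans_same = sdt_indices(hits=42, misses=6, false_alarms=24, correct_rejections=36)

for label, (d_prime, criterion) in [("leans different", leans_different),
                                    ("leans same", leans_same)]:
    print(f"{label}: d' = {d_prime:.2f}, c = {criterion:.2f}")
```

Under these assumptions, a positive criterion corresponds to a listener who tends to answer "different" when unsure and a negative one to a listener who tends to answer "same", independently of how well either listener separates the two kinds of pairs.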

Author Biographies

  • Anna Bartle, Metropolitan Police UK
    Anna Bartle is a Forensic Audio Specialist at the Metropolitan Police in London, UK. She holds a BMus (Tonmeister) degree in Music and Sound Recording from the University of Surrey and an MA in Phonetics from University College London. The research described in this paper formed part of her MA thesis.
  • Volker Dellwo, University of Zurich
    Volker Dellwo (MA, PhD) is Assistant Professor in Phonetics and Speech Sciences in the Phonetics Lab at Zurich University (UZH) and occasionally works as an expert witness in Forensic Phonetics. His research interests lie in a wide variety of duration, rhythm and timing phenomena in speech, typically in relation to speaker individuality. He is the principal investigator in two major funded research grants addressing temporal aspects in speaker individuality.
    Recent publications:
      • Dellwo, V., Hove, I., Leemann, A. and Kolly, M.-J. (2014) Verbrecherjagd mit gesprochener Sprache: Möglichkeiten und Grenzen der forensischen Phonetik. In: Kriminalistik (2/2014), 119-126.
      • Kolly, M.-J. and Dellwo, V. (2014) Cues to linguistic origin: The contribution of speech temporal information to foreign accent recognition. In: JPhon (42), 12-23.
      • Leemann, A., Kolly, M.-J. and Dellwo, V. (2014) Speaker-individuality in supra-segmental temporal features: Implications for forensic voice comparison. In: Forensic Science International (238), 59-67.

Published

2015-11-06

Section

Articles

How to Cite

Bartle, A., & Dellwo, V. (2015). Auditory speaker discrimination by forensic phoneticians and naive listeners in voiced and whispered speech. International Journal of Speech, Language and the Law, 22(2), 229-248. https://doi.org/10.1558/ijsll.v22i2.23101