Acoustic characteristics of disguised speech
speaker strategies and listener error patterns
DOI: https://doi.org/10.1558/ijsll.38372
Keywords: speaker identification, voice identification, acoustic cues robust to disguise, disguised speech
Abstract
A group of 13 participants were recorded in two conditions: 1) speaking normally and 2) altering speech to conceal their identity (i.e., disguised speech). Participants were not instructed how to disguise their speech because we were interested in which changes they would choose. A group of inexperienced listeners were largely inaccurate in matching participants' disguised speech to their normal speech. The largest changes between normal and disguised speech were in speaking rate, the first formant, fundamental frequency (F0), and intensity. When listeners made correct matches, the pairs were similar in speaking rate and F0, as shown by significant correlations. Incorrectly matched pairs were not significantly correlated, suggesting that listeners were not making good use of acoustic cues during those decisions. Overall, the third formant (F3) and speaking rate appeared to be useful acoustic indicators of identity when matching normal and disguised speech samples. Of those two variables, F3 was apparently underutilised by listeners. The implications for what speakers spontaneously do to disguise their speech and what naïve listeners attend to when identifying disguised voices are discussed.