Individual patterns of disfluency across speaking styles: a forensic phonetic investigation of Standard Southern British English

Kirsty McDougall; Martin Duckworth

doi:10.1558/ijsll.37241

Authors

Kirsty McDougall University of Cambridge
Martin Duckworth Independent researcher

Keywords:

fluency behaviour, disfluency features, TOFFA, individual differences, speaker-specificity, speaking style

Abstract

Features of speech related to fluency such as filled and silent pauses, sound prolongations, repetitions and self-interruptions exhibit considerable variation among speakers, yet the speaker-specificity of such features has received little attention in forensic phonetic research. The present study investigates the extent to which individual differences in disfluency behaviour are preserved across different speaking styles, a key concern for forensic speaker comparison cases. Disfluency phenomena in the speech of 20 male speakers of Standard Southern British English undertaking a simulated police interview task are compared with the occurrence of the same set of phenomena in the speech of the same speakers participating in a telephone conversation with an 'accomplice'. The speakers' disfluency features are analysed using TOFFA, the 'Taxonomy of Fluency Features for Forensic Analysis' (McDougall and Duckworth 2017). Individuals exhibit a wide range of variation in their overall rate of production of disfluency features, and these rates are relatively consistent within-speaker across interview and telephone styles. The results for each specific disfluency feature type also show patterns of relatively consistent behaviour within-speaker across-style for most features. For both interview and telephone styles, discriminant analyses based on speaker profiles of disfluency features demonstrate that disfluency features carry speaker-specific information which could be considered alongside other analyses in forensic speaker comparison cases.
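As a rough illustration of the across-style comparison described in the abstract, the following Python sketch computes an overall disfluency rate per speaker in each style and then a Spearman rank correlation between the two styles. It is not the authors' procedure: the normalisation per 100 syllables, the speaker labels, all counts and every function name are assumptions made up for this example, and the discriminant analyses reported in the study are not reproduced here.

```python
from statistics import mean


def rate_per_100_syllables(feature_count, syllable_count):
    """Disfluency features per 100 syllables (normalisation unit is an assumption)."""
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return 100.0 * feature_count / syllable_count


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical, made-up data: speaker -> (disfluency features, syllables).
interview = {"S1": (42, 600), "S2": (18, 550), "S3": (65, 700), "S4": (30, 620)}
telephone = {"S1": (35, 500), "S2": (20, 580), "S3": (58, 640), "S4": (40, 560)}

speakers = sorted(interview)
interview_rates = [rate_per_100_syllables(*interview[s]) for s in speakers]
telephone_rates = [rate_per_100_syllables(*telephone[s]) for s in speakers]

rho = spearman(interview_rates, telephone_rates)
print(f"Spearman rho across styles: {rho:.2f}")
```

A high rank correlation on real data would indicate that speakers who are relatively disfluent in one style tend to be relatively disfluent in the other. A rank-based measure is used here only because it makes no assumption about the distribution of rates.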

Author Biographies

Kirsty McDougall, University of Cambridge

Kirsty McDougall is an Affiliated Lecturer in Theoretical and Applied Linguistics at the University of Cambridge and a Fellow of Selwyn College, Cambridge. Her research interests range across speaker characteristics, theories of speech production, phonetic realisation of varieties of English, and forensic phonetics. Among other things, her forensic phonetic research has focused on speaker-distinguishing properties of dynamic features of speech, perceived voice similarity and its implications for the selection of foils for voice parades, and the development of techniques for analysing individual differences in disfluency behaviour. She is a member of IAFPA.
Martin Duckworth, Independent researcher

Martin Duckworth is a qualified speech and language therapist and has an MA in Phonetics and Linguistics. He worked for over 30 years as a therapist specialising in adults who stutter. He also taught phonetics and speech therapy. Alongside this work he undertook forensic speaker comparisons until his retirement from casework in 2016. He is now an independent researcher with a specific interest in the application of fluency measurement in forensic casework. He is a member of IAFPA.

References

Allen, S., Miller, J. L. and DeSteno, D. (2003) Individual talker differences in voice-onset time. Journal of the Acoustical Society of America 113(1): 544–552. https://doi.org/10.1121/1.1528172

Alzqhoul, E. A. S., Nair, B. B. T. and Guillemin, B. J. (2015) Impact of dynamic rate coding aspects of mobile phone networks on forensic voice comparison. Science and Justice 55(5): 363–374. https://doi.org/10.1016/j.scijus.2015.04.006

Amino, K. and Arai, T. (2009) Speaker-dependent characteristics of the nasals. Forensic Science International 185(1–3): 21–28. https://doi.org/10.1016/j.forsciint.2008.11.018

Bachorowski, J.-A. and Owren, M. J. (1999) Acoustic correlates of talker sex and individual talker identity are present in a short vowel segment produced in running speech. Journal of the Acoustical Society of America 106(2): 1054–1063. https://doi.org/10.1121/1.427115

Boersma, P. and Weenink, D. (1992–2018) Praat: A System for Doing Phonetics by Computer [computer program]. http://www.praat.org/

Bortfeld, H., Leon, S. D., Bloom, J. E., Schober, M. F. and Brennan, S. E. (2001) Disfluency rates in conversation: effects of age, relationship, topic, role and gender. Language and Speech 44(2): 123–147. https://doi.org/10.1177/00238309010440020101

Braun, A. (1995) Fundamental frequency – how speaker-specific is it? In A. Braun and J.-P. Köster (eds.) Studies in Forensic Phonetics (Beiträge zur Phonetik und Linguistik 64) 9–23. Trier: Wissenschaftlicher Verlag Trier.

Braun, A. and Künzel, H. J. (2003) The effect of alcohol on speech prosody. In M. J. Solé, D. Recasens and J. Romero (eds.) Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona 2645–2648. https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2003/papers/p2615_2645.pdf.

Braun, A. and Rosin, A. (2015) On the speaker-specificity of hesitation markers. In The Scottish Consortium for ICPhS 2015 (ed.) Proceedings of the 18th International Congress of Phonetic Sciences, Glasgow. Paper number 731. http://www.icphs2015.info/pdfs/Papers/ICPHS0731.pdf.

Clark, H. H. and Fox Tree, J. E. (2002) Using uh and um in spontaneous speaking. Cognition 84(1): 73–111. https://doi.org/10.1016/S0010-0277(02)00017-3

de Jong, G., McDougall, K., Hudson, T. and Nolan, F. (2007) The speaker-discriminating power of sounds undergoing historical change: a formant-based study. In J. Trouvain and W. J. Barry (eds.) Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken 1813–1816. http://www.icphs2007.de/conference/Papers/1542/1542.pdf.

Dellwo, V. and Koreman, J. (2008) How speaker idiosyncratic is measurable speech rhythm? Paper presented at the International Association for Forensic Phonetics and Acoustics Annual Conference, Lausanne, 20–23 July 2008.

Dellwo, V., Leemann, A. and Kolly, M.-J. (2015) Rhythmic variability between speakers: articulatory, prosodic, and linguistic factors. Journal of the Acoustical Society of America 137(3): 1513–1528. https://doi.org/10.1121/1.4906837

Eklund, R. (2004) Disfluency in Swedish human–human and human–machine travel booking dialogues. Linköping University Studies in Science and Technology Dissertation No. 882 http://www.ida.liu.se/~robek28/pdf/Eklund_2004_PhD_Thesis_Corrected.pdf.

Finlayson, I. R. and Corley, M. (2012) Disfluency in dialogue: an intentional signal from the speaker? Psychonomic Bulletin and Review 19(5): 921–928. https://doi.org/10.3758/s13423-012-0279-x

Goldman-Eisler, F. (1968) Psycholinguistics: Experiments in Spontaneous Speech. London: Academic Press.

Guillemin, B. J. and Watson, C. (2008) Impact of the GSM mobile phone network on the speech signal – some preliminary findings. International Journal of Speech, Language and the Law 15(2): 193–218.

Hollien, H., de Jong, G., Martin, C. A., Schwartz, R. and Liljegren, K. (2001) Effects of ethanol intoxication on speech suprasegmentals. Journal of the Acoustical Society of America 110(6): 3198–3206. https://doi.org/10.1121/1.1413751

Hollien, H., Liljegren, K., Martin, C. A. and de Jong, G. (1999) Prediction of intoxication levels by speech analysis. In A. Braun (ed.) Advances in Phonetics 40–50. Stuttgart: Steiner Verlag.

Hudson, T., de Jong, G., McDougall, K., Harrison, P. and Nolan, F. (2007) F0 statistics for 100 young male speakers of Standard Southern British English. In J. Trouvain and W. J. Barry (eds.) Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken 1809–1812. http://www.icphs2007.de/conference/Papers/1570/1570.pdf.

Hughes, V. (2014) The definition of the relevant population and the collection of data for likelihood ratio-based forensic voice comparison. PhD dissertation, University of York.

Hughes, V., Wood, S. and Foulkes, P. (2016) Strength of forensic voice comparison evidence from the acoustics of filled pauses. International Journal of Speech, Language and the Law 23(1): 99–132. https://doi.org/10.1558/ijsll.v23i1.29874

Huntley Bahr, R. and Pass, K. J. (1996) The influence of style-shifting on voice identification. Forensic Linguistics 3(1): 24–38.

Ishihara, S. and Kinoshita, Y. (2008) How many do we need? Exploration of the population size effect on the performance of forensic speaker classification. In Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech), Brisbane, Australia 1941–1944.

Jessen, M. (1997) Speaker-specific information in voice quality parameters. Forensic Linguistics 4(1): 84–103.

Jessen, M. (2007) Forensic reference data on articulation rate in German. Science and Justice 47(2): 50–67. https://doi.org/10.1016/j.scijus.2007.03.003

Jessen, M. (2008) Forensic phonetics. Language and Linguistics Compass 2(4): 671–711. https://doi.org/10.1111/j.1749-818X.2008.00066.x

Jessen, M., Köster, O. and Gfroerer, S. (2005) Influence of vocal effort on average and variability of fundamental frequency. International Journal of Speech, Language and the Law 12(2): 174–213. https://doi.org/10.1558/sll.2005.12.2.174

Kasl, S. V. and Mahl, G. F. (1965) Relationship of disturbances and hesitations in spontaneous speech to anxiety. Journal of Personality and Social Psychology 1(5): 425–433. https://doi.org/10.1037/h0021918

Kavanagh, C. (2012) New consonantal acoustic parameters for forensic speaker comparison. PhD dissertation, University of York.

Kinoshita, Y., Ishihara, S. and Rose, P. (2009) Exploring the discriminatory potential of F0 distribution parameters in traditional forensic speaker recognition. International Journal of Speech, Language and the Law 16(1): 91–111. https://doi.org/10.1558/ijsll.v16i1.91

Klatt, D. (1976) Linguistic uses of segmental duration in English: acoustic and perceptual evidence. The Journal of the Acoustical Society of America 59(5): 1208–1221. https://doi.org/10.1121/1.380986

Künzel, H. J. (1989) How well does average fundamental frequency correlate with speaker height and weight? Phonetica 46(1–3): 117–125. https://doi.org/10.1159/000261832

Künzel, H. J. (1997) Some general phonetic and forensic aspects of speaking tempo. Forensic Linguistics 4(1): 48–83.

Künzel, H. J. (2001) Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Linguistics 8(1): 80–99.

Leemann, A. and Kolly, M.-J. (2015) Speaker-invariant suprasegmental temporal features in normal and disguised speech. Speech Communication 75: 97–122. https://doi.org/10.1016/j.specom.2015.10.002

Leemann, A., Kolly, M.-J. and Dellwo, V. (2014) Speaker-individuality in suprasegmental temporal features: implications for forensic voice comparison. Forensic Science International 238: 59–67. https://doi.org/10.1016/j.forsciint.2014.02.019

Leemann, A., Mixdorff, H., O’Reilly, M., Kolly, M.-J. and Dellwo, V. (2014) Speaker-individuality in Fujisaki model f0 features: implications for forensic voice comparison. International Journal of Speech, Language and the Law 21(2): 343–370. https://doi.org/10.1558/ijsll.v21i2.343

Lindblom, B. (1963) Spectrographic study of vowel reduction. Journal of the Acoustical Society of America 35(11): 1773–1781. https://doi.org/10.1121/1.1918816

Lindh, J. and Eriksson, A. (2007) Robustness of long time measures of fundamental frequency. In Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech), Antwerp, Belgium 2025–2028.

McDougall, K. (2004) Speaker-specific formant dynamics: an experiment on Australian English /aɪ/. International Journal of Speech, Language and the Law 11(1): 103–130. https://doi.org/10.1558/sll.2004.11.1.103

McDougall, K. (2006) Dynamic features of speech and the characterization of speakers: toward a new approach using formant frequencies. International Journal of Speech, Language and the Law 13(1): 89–126. https://doi.org/10.1558/sll.2006.13.1.89

McDougall, K. and Duckworth, M. (2017) Profiling fluency: an analysis of individual variation in disfluencies in adult males. Speech Communication 95: 16–27. https://doi.org/10.1016/j.specom.2017.10.001

McDougall, K. and Nolan, F. (2007) Discrimination of speakers using the formant dynamics of /uː/ in British English. In J. Trouvain and W. J. Barry (eds.) Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken 1825–1828. http://www.icphs2007.de/conference/Papers/1567/1567.pdf.

McDougall, K., Rhodes, R., Duckworth, M., French, J. P., Kirchhübel, C. and Wormald, J. (2018) Applying disfluency analysis in forensic speaker comparison casework. Paper presented at the International Association for Forensic Phonetics and Acoustics Annual Conference, Huddersfield, 29 July–1 August 2018.

Moos, A. (2010) Long-term formant distribution as a measure of speaker characteristics in read and spontaneous speech. The Phonetician 101/102: 7–24.

Morrison, G. S. (2008) Forensic voice comparison using likelihood ratios based on polynomial curves fitted to the formant trajectories of Australian English /aɪ/. International Journal of Speech, Language and the Law 15(2): 249–266.

Nair, B. B. T., Alzqhoul, E. A. S. and Guillemin, B. J. (2016) Impact of the GSM and CDMA mobile phone networks on the strength of speech evidence in forensic voice comparison. Journal of Forensic Research 7(2): 322. https://www.omicsonline.org/open-access/impact-of-the-gsm-and-cdma-mobile-phone-networks-on-the-strength-ofspeech-evidence-in-forensic-voice-comparison-2157-7145-1000324.pdf. https://doi.org/10.4172/2157-7145.1000324

Nolan, F. (1983) The Phonetic Bases of Speaker Recognition. Cambridge: Cambridge University Press.

Nolan, F. (1997) Speaker recognition and forensic phonetics. In W. J. Hardcastle and J. Laver (eds.) The Handbook of Phonetic Sciences 744–767. Oxford: Blackwell.

Nolan, F. (2002) Intonation in speaker identification: an experiment on pitch alignment features. Forensic Linguistics 9(1): 1–21. https://doi.org/10.1558/sll.2002.9.1.1

Nolan, F. and Grigoras, C. (2005) A case for formant analysis in forensic speaker identification. International Journal of Speech, Language and the Law 12(2): 143–173. https://doi.org/10.1558/sll.2005.12.2.143

Nolan, F., McDougall, K., de Jong, G. and Hudson, T. (2009) The DyViS database: style-controlled recordings of 100 homogeneous speakers for forensic phonetic research. International Journal of Speech, Language and the Law 16(1): 31–57.

Oviatt, S. (1995) Predicting spoken disfluencies during human–computer interaction. Computer Speech and Language 9(1): 19–35. https://doi.org/10.1006/csla.1995.0002

Peterson, G. E. and Barney, H. L. (1952) Control methods used in a study of the vowels. Journal of the Acoustical Society of America 24(2): 175–184. https://doi.org/10.1121/1.1906875

Roberts, P. M., Meltzer, A. and Wilding, J. (2009) Disfluencies in non-stuttering adults across sample lengths and topics. Journal of Communication Disorders 42(6): 414–427. https://doi.org/10.1016/j.jcomdis.2009.06.001

Rose, P. (1999) Long- and short-term within-speaker differences in the formants of Australian hello. Journal of the International Phonetic Association 29(1): 1–31. https://doi.org/10.1017/S0025100300006393

Rose, P. (2002) Forensic Speaker Identification. London: Taylor and Francis. https://doi.org/10.1201/9780203166369

Rose, P. (2010) The effect of correlation on strength of evidence estimates in Forensic Voice Comparison: uni- and multivariate Likelihood Ratio-based discrimination with Australian English vowel acoustics. International Journal of Biometrics 2(4): 316–329. https://doi.org/10.1504/IJBM.2010.035447

Schachter, S., Christenfeld, N., Ravina, B. and Bilous, F. (1991) Speech disfluency and the structure of knowledge. Journal of Personality and Social Psychology 60(3): 362–367. https://doi.org/10.1037/0022-3514.60.3.362

Schiel, F. and Heinrich, C. (2015) Disfluencies in the speech of intoxicated speakers. International Journal of Speech, Language and the Law 22(1): 19–33. https://doi.org/10.1558/ijsll.v22i1.24767

Schiel, F., Heinrich, C. and Barfüßer, S. (2012) Alcohol language corpus: the first public corpus of alcoholized German speech. Language Resources and Evaluation 46(3): 503–521. https://doi.org/10.1007/s10579-011-9139-y

Shriberg, E. (2001) To ‘errrr’ is human: ecology and acoustics of speech disfluencies. Journal of the International Phonetic Association 31(1): 153–169. https://doi.org/10.1017/S0025100301001128

Tabachnick, B. G. and Fidell, L. S. (2014) Using Multivariate Statistics (6th edn). Harlow: Pearson.

Wightman, C. W., Shattuck-Hufnagel, S., Ostendorf, M. and Price, P. J. (1992) Segmental durations in the vicinity of prosodic phrase boundaries. The Journal of the Acoustical Society of America 91(3): 1707–1717. https://doi.org/10.1121/1.402450
