Effects of the landline telephone filter and linguistic context on speaker-dependent variability in /s/
DOI:
https://doi.org/10.1558/ijsll.25337
Keywords:
telephone speech, fricatives, speaker classification
Abstract
Previous research has demonstrated that vowel formants situated near the lower limit of the bandwidth show effects of the telephone filter (Künzel 2001; Byrne and Foulkes 2004). In this work, the effects of the landline telephone filter are examined for one consonant, the fricative /s/. Although this speech sound is expected to show large effects of narrowband telephone filters because of its high-frequency spectral characteristics, previous work on Dutch telephone speech showed that, despite its compromised acoustics, /s/ still contains a considerable amount of speaker information (Smorenburg and Heeren 2021). Using English data that were recorded simultaneously as broadband and telephone speech, this work shows large effects of the telephone filter on both the acoustics of /s/ and speaker classification by linear discriminant analysis (LDA). Linguistic effects were observable only in the studio recordings and generally did not affect speaker classification, with one exception: when the following context was labial, LDA speaker-classification accuracy was higher, indicating idiosyncrasies in anticipatory labial coarticulation.
References
Barry, W. and Andreeva, B. (2001) Cross-language similarities and differences in spontaneous speech patterns. Journal of the International Phonetic Association 31(1): 51–66. https://doi.org/10.1017/S0025100301001050
Bell-Berti, F. and Harris, K. S. (1982) Temporal patterns of coarticulation: lip rounding. Journal of the Acoustical Society of America 71: 449–454.
Bessette, B., Salami, R., Lefebvre, R., Jelínek, M., Rotola-Pukkila, J., Vainio, J., Mikkola, H. and Järvinen, K. (2002) The Adaptive Multirate Wideband speech codec (AMR-WB). IEEE Transactions on Speech and Audio Processing 10: 620–636. http://dx.doi.org/10.1109/tsa.2002.804299
Boersma, P. and Weenink, D. (2023) Praat: doing phonetics by computer [Computer program]. Version 6.0.40, retrieved 1 April 2018 from http://www.praat.org/
Byrne, C. and Foulkes, P. (2004) The ‘mobile phone effect’ on vowel formants. International Journal of Speech Language and the Law 11: 83–102. http://dx.doi.org/10.1558/ijsll.v11i1.83
Collins, B. and Mees, I. M. (1984) The Sounds of English and Dutch (5th ed.). Leiden: Brill Archive.
Cunha, C. and Reubold, U. (2015) The contribution of vowel coarticulation and prosodic weakening in initial and final fricatives to sound change. Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow: IPA.
Ditewig, S., Pinget, A. C. H. and Heeren, W. F. L. (2019) Regional variation in the pronunciation of /s/ in the Dutch language area. Nederlandse Taalkunde 24: 195–212. http://dx.doi.org/10.5117/nedtaa2019.2.003.dite
Ditewig, S., Smorenburg, L., Quené, H. and Heeren, W. (2021) An acoustic-phonetic study of retraction of /s/ in Moroccan Dutch and endogenous Dutch. Nederlandse Taalkunde 26: 315–338. http://dx.doi.org/10.5117/nedtaa2021.3.001.dite
Forrest, K., Weismer, G., Milenkovic, P. and Dougall, R. N. (1988) Statistical analysis of word-initial voiceless obstruents: preliminary data. Journal of the Acoustical Society of America 84: 115–123. http://dx.doi.org/10.1121/1.396977
Gold, E. and French, P. (2011) International practices in forensic speaker comparison. International Journal of Speech, Language and the Law 18: 293–307. http://dx.doi.org/10.1558/ijsll.v18i2.293
Gold, E. and French, P. (2019) International practices in forensic speaker comparisons: second survey. International Journal of Speech, Language and the Law 26: 1–20. http://dx.doi.org/10.1558/ijsll.38028
Gold, E., Ross, S. and Earnshaw, K. (2018) The ‘West Yorkshire Regional English Database’: investigations into the generalizability of reference populations for forensic speaker comparison casework. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH September 2018: 2748–2752.
Guillemin, B. J. and Watson, C. I. (2006) Impact of the GSM AMR speech codec on formant information important to forensic speaker identification. In P. Warren and C. I. Watson (eds) Proceedings of the 11th Australian International Conference on Speech Science and Technology 483–488. Auckland: Australian Speech Science & Technology Association Inc.
He, L. and Dellwo, V. (2017) Between-speaker variability in temporal organizations of intensity contours. Journal of the Acoustical Society of America 141: EL488–EL494. http://dx.doi.org/10.1121/1.4983398
He, L., Zhang, Y. and Dellwo, V. (2019) Between-speaker variability and temporal organization of the first formant. Journal of the Acoustical Society of America 145: EL209–EL214. http://dx.doi.org/10.1121/1.5093450
Heeren, W. F. (2020) The effect of word class on speaker-dependent information in the Standard Dutch vowel /a:/. Journal of the Acoustical Society of America 148(4): 2028–2039. https://doi.org/10.1121/10.0002173
Holliday, J. J., Reidy, P. F., Beckman, M. E. and Edwards, J. (2015) Quantifying the robustness of the English sibilant fricative contrast in children. Journal of Speech, Language, and Hearing Research 58: 622–637. http://dx.doi.org/10.1044/2015_jslhr-s-14-0090
Hoole, P., Nguyen-Trong, N. and Hardcastle, W. (1993) A comparative investigation of coarticulation in fricatives: electropalatographic, electromagnetic, and acoustic data. Language and Speech 36: 235–260. http://dx.doi.org/10.1177/002383099303600307
Jongman, A., Wayland, R. and Wong, S. (2000) Acoustic characteristics of English fricatives. Journal of the Acoustical Society of America 108: 1252. http://dx.doi.org/10.1121/1.1288413
Junqua, J.-C., Fincke, S. and Field, K. (1999) The Lombard effect: a reflex to better communicate with others in noise. In 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing 2083–2086. New York: IEEE. http://dx.doi.org/10.1109/icassp.1999.758343
Kavanagh, C. M. (2012) New Consonantal Acoustic Parameters for Forensic Speaker Comparison. York: University of York.
Klecka, W. R. (1980) Discriminant analysis. In Quantitative Applications in the Social Sciences vol. 19. London: Sage Publications.
Koenig, L. L., Shadle, C. H., Preston, J. L. and Mooshammer, C. R. (2013) Toward improved spectral measures of /s/: results from adolescents. Journal of Speech Language and Hearing Research 56: 1175. http://dx.doi.org/10.1044/1092-4388(2012/12-0038)
Künzel, H. J. (2001) Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Linguistics 8: 80–99. http://dx.doi.org/10.1558/sll.2001.8.1.80
Li, F., Rendall, D., Vasey, P. L., Kinsman, M., Ward-Sutherland, A. and Diano, G. (2016) The development of sex/gender-specific /s/ and its relationship to gender identity in children and adolescents. Journal of Phonetics 57: 59–70. http://dx.doi.org/10.1016/j.wocn.2016.05.004
Magen, H. S. (1997) The extent of vowel-to-vowel coarticulation in English. Journal of Phonetics 25(2): 187–205. http://dx.doi.org/10.1006/jpho.1996.0041
McDougall, K. (2004) Speaker-specific formant dynamics: an experiment on Australian English /aI/. International Journal of Speech, Language and the Law 11: 103–130. http://dx.doi.org/10.1558/sll.2004.11.1.103
Monson, B. B., Lotto, A. J. and Story, B. H. (2012) Analysis of high-frequency energy in long-term average spectra of singing, speech, and voiceless fricatives. Journal of the Acoustical Society of America 132: 1754–1764. http://dx.doi.org/10.1121/1.4742724
Munson, B. (2004) Variability in /s/ production in children and adults. Journal of Speech Language and Hearing Research 47: 58–69.
Munson, B., McDonald, E. C., DeBoe, N. L. and White, A. R. (2006) The acoustic and perceptual bases of judgments of women and men’s sexual orientation from read speech. Journal of Phonetics 34(2): 202–240. http://dx.doi.org/10.1016/j.wocn.2005.05.003
Niebuhr, O., Clayards, M., Meunier, C. and Lancia, L. (2011) On place assimilation in sibilant sequences: comparing French and English. Journal of Phonetics 39: 429–451. http://dx.doi.org/10.1016/j.wocn.2011.04.003
Nittrouer, S. and Whalen, D. H. (1989) The perceptual effects of child–adult differences in fricative-vowel coarticulation. Journal of the Acoustical Society of America 86: 1266–1276. http://dx.doi.org/10.1121/1.398741
Nolan, F. (1983) The Phonetic Bases of Speaker Recognition. Cambridge Studies in Speech Science and Communication. Cambridge: Cambridge University Press.
Ohala, J. J. and Kawasaki, H. (1984) Prosodic phonology and phonetics. Phonology 1: 113–127. http://dx.doi.org/10.1017/s0952675700000312
Oliveira, M. and Freitas, T. (2008) Intonation as a cue to turn management in telephone and face-to-face interactions. In Proceedings of Speech Prosody 485–488. Campinas, Brazil: ISCA.
Pacilly, J. J. A. (2014) Annotate [Praat script]. In Workshop Analysis and Synthesis of Speech (Ch. Digital recordings 1). Retrieved from https://phonetics.pacilly.nl/misp/
Quené, H., Orr, R. and Van Leeuwen, D. (2017) Phonetic similarity of /s/ in native and second language: individual differences in learning curves. Journal of the Acoustical Society of America 142: 525. http://dx.doi.org/10.1121/1.5013149
R Core Team (2019) R: A Language and Environment for Statistical Computing (version 4.0.1). Vienna: R Foundation for Statistical Computing.
Redford, M. A. and Diehl, R. L. (1999) The relative perceptual distinctiveness of initial and final consonants in CVC syllables. Journal of the Acoustical Society of America 106: 1555. http://dx.doi.org/10.1121/1.427152
Shadle, C. H. and Scully, C. (1995) An articulatory-acoustic-aerodynamic analysis of [s] in VCV sequences. Journal of Phonetics 23: 53–66. http://dx.doi.org/10.1016/s0095-4470(95)80032-8
Smorenburg, L. and Heeren, W. (2021) The distribution of speaker information in Dutch fricatives /s/ and /x/ from telephone dialogues. Journal of the Acoustical Society of America 150(2): 979–989. https://doi.org/10.1121/10.0005845
Solé, M. J. (2003) Aerodynamic characteristics of onset and coda fricatives. In M. J. Solé, D. Recasens and J. Romero (eds) Proceedings of the 15th International Congress of Phonetic Sciences 2761–2764. Barcelona.
Soli, S. D. (1981) Second formants in fricatives: acoustic consequences of fricative-vowel coarticulation. Journal of the Acoustical Society of America 70: 976–984. http://dx.doi.org/10.1121/1.387032
Stevens, K. N. (2000) Acoustic Phonetics vol. 30. London: MIT Press.
Stuart-Smith, J. (2007) Empirical evidence for gendered speech production: /s/ in Glaswegian. In J. Cole and J. I. Hualde (eds) Laboratory Phonology 9 65–86. New York: Mouton de Gruyter.
Su, L. S., Li, K. P. and Fu, K. S. (1974) Identification of speakers by use of nasal coarticulation. Journal of the Acoustical Society of America 56(6): 1876–1883. https://doi.org/10.1121/1.1903526
Van Berkum, J. J. A., Van Den Brink, D., Tesink, C. M. J. Y., Kos, M. and Hagoort, P. (2008) The neural integration of speaker and message. Journal of Cognitive Neuroscience 20: 580–591. http://dx.doi.org/10.1162/jocn.2008.20054
Van den Heuvel, H. (1996) Speaker Variability in Acoustic Properties of Dutch Phoneme Realisations. Nijmegen: Radboud Universiteit.
Van der Vloed, D., Kelly, F. and Alexander, A. (2020) Exploring the effects of device variability on forensic speaker comparison using VOCALISE and NFI-FRIDA, a forensically realistic database. In Proceedings of the Odyssey Speaker and Language Recognition Workshop 402–407. Tokyo: ISCA.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S (4th ed.). New York: Springer.
Viszlay, P., Juhár, J. and Pleva, M. (2012) Alternative phonetic class definition in linear discriminant analysis of speech. In 19th International Conference on Systems, Signals and Image Processing 637–640. Vienna: IEEE.
Voeten, C. (2020) buildmer: Stepwise Elimination and Term Reordering for Mixed-Effects Regression. R package version 1.5.