Who owns your voice? Linguistic and legal perspectives on the relationship between vocal distinctiveness and the rights of the individual speaker

Dominic Watt; Peter S. Harrison; Lily Cabot-King

doi:10.1558/ijsll.40571

Authors

Dominic Watt University of York
Peter S. Harrison University of York
Lily Cabot-King University of York

DOI:

https://doi.org/10.1558/ijsll.40571

Keywords:

Voice, intellectual property, copyright, fraud, speech technology

Abstract

Only in very recent times has the concept of ‘ownership’ of a human voice begun to demand proper consideration in terms of its legal implications. The current lack of clarity with respect to the rights afforded to individuals and organisations in this area is something that must be addressed as a matter of some urgency, given that voice samples are now collected on an unprecedented scale, with or without the knowledge or consent of the person(s) who produced the captured speech. In this article we explore the issue of voice ownership from a variety of perspectives, starting with an attempt to define what the voice actually is, and then considering how representations of a talker’s voice at greater or lesser levels of concreteness (or ‘tangibility’) can be misappropriated and misused in unethical or unlawful ways.

Author Biographies

Dominic Watt, University of York

Dominic Watt is Senior Lecturer in Forensic Speech Science at the University of York, UK. His research interests include forensic linguistics and phonetics, speech perception, sociophonetics, dialectology, and language and identity studies. He is Co-Investigator on the UK Economic and Social Research Council-funded projects 'The Use and Utility of Localised Speech Forms in Determining Identity: Forensic and Sociophonetic Perspectives' (2016-19, ES/M010783/1) and 'Accent Bias and Fair Access in Britain' (2017-20, ES/P007767/1). He is co-editor of 'The Handbook of Dialectology' (Wiley, 2017) and 'Language and Identities' (Edinburgh University Press, 2010), and undertakes occasional forensic casework on behalf of JP French Associates, York.

Peter S. Harrison, University of York

Peter Harrison (FHEA, FRSA) is Lecturer in Law at York Law School, University of York, UK. His background is in physiology and pharmaceutical science, and he has PhDs in both pharmacology and law. He qualified as a Solicitor of the Senior Courts of England & Wales in 1997, after which he worked in intellectual property (IP) litigation and exploitation in the UK and Canada, becoming practice head for IP at an international law firm. He is an elected Associate of the Chartered Institute of Patent Attorneys (UK). His main research interests lie at the interface between biological innovation and IP protection, with a focus on the justifiable scope of indigenous sui generis rights in traditional therapeutic knowledge and genetic resources to prevent their misappropriation. He is also active in research on the role of patents and other IP rights in the incentivisation and governance of technology innovation.

Lily Cabot-King, University of York

Lily Cabot-King graduated with an LLB from York Law School in 2018, with intellectual property as her final-year specialisation. Her dissertation research addressed intellectual property in installation art. She was awarded the Tanya Walker Prize for Clinic by York Law School in 2017. The Hogan Lovells student enhancement activities funding that she was awarded in 2018 supported the research she conducted for the present paper. Lily is bilingual in English and French.
