Unregistered words in everyday language and a proposal for their optimal lexicographic microstructure
DOI:
https://doi.org/10.1558/lexi.26357
Keywords:
unregistered words, instant messenger corpus, microstructure, lexicography, neologisms
Abstract
This article examines how lexicography can adapt to media change. Instant messengers are the most popular communication medium in Korea: according to the latest survey by Gallup Korea, they are used by 92% of the population overall. From a corpus-linguistic point of view, an instant messenger corpus is therefore an ideal resource for accessing the language of the masses. In this contribution, we analyze an instant messenger corpus of 1.4 million words and examine the unregistered words prevalent in it in order to propose a microstructural model for them. Section 2 introduces the normalized parallel corpus of Messenger used in this study, gives an operational definition of unregistered words for dictionary inclusion, and describes how they were extracted. Section 3 examines the prevalence of unregistered words in this Messenger corpus and categorizes them according to the characteristics of messenger language: deviations from the pre-existing writing system, deviations from linguistic norms, deviations from socio-ethical criteria, incidental omissions, and non-verbal expressions. Section 4 proposes an optimal lexicographic structure that incorporates unregistered words and the characteristics identified in the preceding sections. We also discuss how the microstructures of existing dictionaries could be extended and modified to represent the language of this new medium effectively.
References
Baron, N. S. (2010). Always on: Language in an online and mobile world. Oxford: Oxford University Press.
Bauer, L. (2001). Morphological productivity (vol. 95). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486210
Cho, J. (2021). Korean dictionary compilation history. Seoul: Hangeulhakhoy.
Cho, T., Kim, S., Yu, H., Kim, Y., and Lee, J. (2018). A basic study for ethical linguistic expressions of interactive artificial intelligence: Focusing on classification of unethical linguistic expressions as word units. Eomunhak, 140, 65–96. https://doi.org/10.37967/emh.2018.06.140.65
Do, W. (2017). A lexical study of Korean exclamation in early 20th century spoken corpus. Journal of Korean Culture, 36, 49–74.
Jeon, S. (2011). Media and communication from the viewpoint of the philosophy of education: Focusing on the language of oral age, literate age and electronic age. Korean Journal of Philosophy of Education, 33(3), 113–134. https://doi.org/10.15754/jkpe.2011.33.3.006
Lee, Y. (1936). Hangeul movement: How does the compilation of a Korean dictionary work? Hangeul, 31; Narasarang, 13, 128–131.
McCarthy, M., and Carter, R. (2006). Ten criteria for a spoken grammar. Explorations in Corpus Linguistics, 27, 27–52.
McCulloch, G. (2019). Because internet: Understanding the new rules of language. London: Penguin Books.
Nam, K. (2022). Between orality and literacy, a study on discourse and grammar in instant messages – focusing on the comparison between the instant messenger corpus and the spoken corpus. Eomunyeongu, 50, 5–34.
Nam, K., Lee, S., and Choi, J. (2018). Research trends and issues on semantic neology using web corpus. Journal of Korealex, 31, 55–84. https://doi.org/10.33641/kolex.2018..31.55
Nam, K., Lee, S., and Jung, H. Y. (2020). The Korean Neologism Investigation Project: Current status and key issues. Dictionaries: Journal of the Dictionary Society of North America, 41(1), 105–129. https://doi.org/10.1353/dic.2020.0007
Nam, K., Song, H., Kim, J., Huang, Y., and Ahn, E. (2021). Analysis of the 2021 normalized spelling corpus. Seoul: National Institute of Korean Language (NIKL).
Nam, K., and Huang, Y. (2023). A study on language of the third medium – Messenger. Hangugeohak, 99, 1–30. https://doi.org/10.20405/kl.2023.05.99.1
Ong, W. J. (1982). Orality and literacy. London: Routledge. https://doi.org/10.4324/9780203328064
Schmid, H.-J. (2008). New words in the mind: Concept-formation and entrenchment of neologisms. Anglia, 126(1), 1–36. https://doi.org/10.1515/angl.2008.002
Smyk-Bhattacharjee, D. (2009). Lexical innovation on the internet: Neologisms in blogs. Doctoral dissertation, University of Zurich.
Websites
Google. http://google.co.kr
Hankuk Gallup (2022). Usage of 18 types of media, content, and social network services. Market, 70(2). https://www.gallup.co.kr/gallupdb/reportContent.asp?seqNo=1323
Dictionaries
Merriam-Webster Dictionary (online). https://www.merriam-webster.com
Oxford Learner’s Dictionaries (online). Oxford: Oxford University Press. https://www.oxfordlearnersdictionaries.com
Standard Korean Language Dictionary. Seoul: National Institute of Korean Language (NIKL). https://stdict.korean.go.kr
Urban Dictionary. https://www.urbandictionary.com
Urimalsaem (online). Seoul: NIKL. https://opendict.korean.go.kr/main
Corpora
Korean National Corpus in the 21st Century Sejong Project (2015 revised). Seoul: NIKL.
Normalized spelling corpus (2021). Seoul: NIKL. https://corpus.korean.go.kr