Unregistered words in everyday language and a proposal for their optimal lexicographic microstructure


  • Yinxia Huang, Pai Chai University
  • Kilim Nam, Kyungpook National University




Keywords: unregistered words, instant messenger corpus, microstructure, lexicography, neologisms


This article examines how lexicography can adapt to media change. Instant messengers are the most popular communication medium in Korea: according to the latest survey by Gallup Korea, they are used by 92% of the population overall. An instant messenger corpus is therefore an ideal resource for accessing the language of the masses from a corpus-linguistic point of view. In this contribution, we analyze an instant messenger corpus of 1.4 million words and examine the unregistered words prevalent in it in order to propose a microstructural model for them. Section 2 introduces the normalized parallel Messenger corpus used in this study, sets out the operational definition of unregistered words for dictionary inclusion, and discusses the methodology for extracting them. Section 3 examines the prevalence of unregistered words in this corpus and categorizes them according to the characteristics of messenger language: deviations from the pre-existing writing system, deviations from linguistic norms, deviations from socio-ethical criteria, incidental omissions, and non-verbal expressions. Section 4 proposes an optimal lexicographical structure that incorporates unregistered words and the characteristics identified in the previous sections. We also discuss how the microstructures of existing dictionaries could be extended and modified to represent the language of this new medium effectively.

Author Biographies

  • Yinxia Huang, Pai Chai University

    Yinxia Huang is an assistant professor in linguistics and has been teaching since 2011 at the Department of Korean Language, Literature, and Education at Pai Chai University, South Korea. She received her doctoral degree in corpus linguistics from Yonsei University, Seoul. She is currently a board member of KOREALEX. Her major interests are corpus linguistics, contrastive linguistics, and lexicography.

  • Kilim Nam, Kyungpook National University

    Kilim Nam is a professor at the Department of Korean Language and Literature at Kyungpook National University (Daegu, South Korea). She holds a PhD in Korean linguistics (on the copula ida structures in contemporary Korean, 2004) from Yonsei University (Seoul). She is currently a board member of KOREALEX and president of ASIALEX. She has been the principal investigator of the Korean Neologism Investigation Project since 2012. Her research focuses on corpus linguistics and language performance.


Baron, N. S. (2010). Always on: Language in an online and mobile world. Oxford: Oxford University Press.

Bauer, L. (2001). Morphological productivity (vol. 95). Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486210

Cho, J. (2021). Korean dictionary compilation history. Seoul: Hangeulhakhoy.

Cho, T., Kim, S., Yu, H., Kim, Y., and Lee, J. (2018). A basic study for ethical linguistic expressions of interactive artificial intelligence: Focusing on classification of unethical linguistic expressions as word units. Eomunhak, 140, 65–96. https://doi.org/10.37967/emh.2018.06.140.65

Do, W. (2017). A lexical study of Korean exclamation in early 20th century spoken corpus. Journal of Korean Culture, 36, 49–74.

Jeon, S. (2011). Media and communication from the viewpoint of the philosophy of education: Focusing on the language of oral age, literate age and electronic age. Korean Journal of Philosophy of Education, 33(3), 113–134. https://doi.org/10.15754/jkpe.2011.33.3.006

Lee, Y. (1936). Hangeul movement: How does the compilation of a Korean dictionary work? Hangeul, 31; Narasarang, 13, 128–131.

McCarthy, M., and Carter, R. (2006). Ten criteria for a spoken grammar. Explorations in Corpus Linguistics, 27, 27–52.

McCulloch, G. (2019). Because internet: Understanding the new rules of language. London: Penguin Books.

Nam, K. (2022). Between orality and literacy, a study on discourse and grammar in instant messages – focusing on the comparison between the instant messenger corpus and the spoken corpus. Eomunyeongu, 50, 5–34.

Nam, K., Lee, S., and Choi, J. (2018). Research trends and issues on semantic neology using web corpus. Journal of Korealex, 31, 55–84. https://doi.org/10.33641/kolex.2018..31.55

Nam, K., Lee, S., and Jung, H. Y. (2020). The Korean Neologism Investigation Project: Current status and key issues. Dictionaries: Journal of the Dictionary Society of North America, 41(1), 105–129. https://doi.org/10.1353/dic.2020.0007

Nam, K., Song, H., Kim, J., Huang, Y., and Ahn, E. (2021). Analysis of the 2021 normalized spelling corpus. Seoul: National Institute of Korean Language (NIKL).

Nam, K., and Huang, Y. (2023). A study on language of the third medium – Messenger. Hangugeohak, 99, 1–30. https://doi.org/10.20405/kl.2023.05.99.1

Ong, W. J. (1982). Orality and literacy. London: Routledge. https://doi.org/10.4324/9780203328064

Schmid, H.-J. (2008). New words in the mind: Concept-formation and entrenchment of neologisms. Anglia, 126(1), 1–36. https://doi.org/10.1515/angl.2008.002

Smyk-Bhattacharjee, D. (2009). Lexical innovation on the internet: Neologisms in blogs. Doctoral dissertation, University of Zurich.


Google. http://google.co.kr

Hankuk Gallup (2022). Usage of 18 types of media, content, and social network services. Market, 70(2). https://www.gallup.co.kr/gallupdb/reportContent.asp?seqNo=1323


Merriam-Webster Dictionary (online). https://www.merriam-webster.com

Oxford Learner’s Dictionaries (online). Oxford: Oxford University Press. https://www.oxfordlearnersdictionaries.com

Standard Korean Language Dictionary. Seoul: National Institute of Korean Language (NIKL). https://stdict.korean.go.kr

Urban Dictionary. https://www.urbandictionary.com

Urimalsaem (online). Seoul: NIKL. https://opendict.korean.go.kr/main


Korean National Corpus in the 21st Century Sejong Project (2015 revised). Seoul: NIKL.

Normalized spelling corpus (2021). Seoul: NIKL. https://corpus.korean.go.kr



How to Cite

Huang, Y., & Nam, K. (2023). Unregistered words in everyday language and a proposal for their optimal lexicographic microstructure. Lexicography, 10(2), 94–116. https://doi.org/10.1558/lexi.26357