Lexical markup framework
an ISO standard for electronic lexicons and its implications for Asian languages
Keywords: Machine-readable dictionaries, Natural language processing, Asian languages, ISO, UML
Lexical markup framework (LMF) is the ISO standard for representing machine-readable dictionaries (MRDs) and natural language processing lexicons. The formal specification was officially published in 2008 under the reference ISO 24613:2008, after a five-year study and a series of meetings that brought together 60 lexicon managers and linguists from a variety of cultures and languages. The ISO document contains a formal specification in the form of a Unified Modeling Language (UML) model, accompanied by a selection of examples of word descriptions in Asian, European, Semitic and Turkish languages. The model has since been applied to a couple of African languages. In this text, after a brief introduction to LMF, we present some of the difficult challenges that arise in representing a selection of Asian languages, in the context of dictionaries in general and MRDs in particular.