Visualizing Linguistic Complexity and Proficiency in Learner English Writings


  • Thomas Gaillat University of Rennes
  • Antoine Lafontaine French National Research Institute for Health, Environment and Work
  • Anas Knefati Orange Business France



Keywords: linguistic complexity, L2 English, automatic essay feedback, visualization


In this article, we focus on the design of a second language (L2) formative feedback system that provides linguistic complexity graph reports on the writings of university-level English for specific purposes students. The system is evaluated in light of formative instruction features identified in the literature, and the significance of its complexity metrics is also assessed. A learner corpus of English classified according to the Common European Framework of Reference for Languages (CEFR) was processed using a pipeline that computes 83 complexity metrics. By way of analysis of variance (ANOVA) testing, multinomial logistic regression, and clustering methods, we identified and validated a set of nine metrics that are significant with respect to proficiency levels. Validation with classification yielded balanced accuracies of 67.51% (A level), 60.16% (B level), and 60.47% (C level). Clustering showed between 53.10% and 67.37% homogeneity, depending on the level. As a result, these metrics were used to create graphical reports on the linguistic complexity of learner writing. These reports are designed to help language teachers diagnose their students’ writings in comparison with prerecorded cohorts of different proficiencies.
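To illustrate the two evaluation measures the abstract reports, the following sketch (not the authors' code; all data are random stand-ins for the nine significant complexity metrics, and the three-band split is a simplification of the CEFR levels) shows how balanced accuracy for a multinomial logistic regression and homogeneity for a clustering can be computed with scikit-learn:

```python
# Illustrative sketch only: toy data replace the real complexity metrics.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, homogeneity_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))        # 9 metric values per essay (toy)
y = rng.integers(0, 3, size=300)     # proficiency bands: 0=A, 1=B, 2=C

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Multinomial logistic regression over the metric set; balanced accuracy
# averages per-class recall, so it is robust to uneven level sizes.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
bal_acc = balanced_accuracy_score(y_test, clf.predict(X_test))

# K-means with one cluster per level; homogeneity is 1.0 when every
# cluster contains essays of a single proficiency level.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hom = homogeneity_score(y, clusters)

print(f"balanced accuracy: {bal_acc:.3f}, homogeneity: {hom:.3f}")
```

On real metric vectors, scores in the ranges the abstract reports would indicate that the nine metrics carry proficiency signal well above the 33% chance level of a three-way split.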

Author Biographies

  • Thomas Gaillat, University of Rennes

    Thomas Gaillat is Associate Professor in corpus linguistics at the University of Rennes in France, where he also teaches English for specific purposes. He is a member of the LIDILE research team. He received his doctorate in 2016 from the University of Sorbonne Paris Cité, graduating summa cum laude. His thesis focused on corpus interoperability as a method to explore how this, that, and it, as referential forms, are used by learners of English. His publications cover linguistic questions at the intersection of natural language processing, corpus linguistics, and statistics. His main line of research focuses on language acquisition. He is the principal investigator of a project on analytics for language learning (A4LL), funded by the French National Research Agency (Agence nationale de la recherche, ANR).

  • Antoine Lafontaine, French National Research Institute for Health, Environment and Work

    Antoine Lafontaine is a statistical engineer who worked on educational data for the Data Tank at the National School for Statistics and Data Analysis (École nationale de la statistique et de l’analyse de l’information, ENSAI) from 2019 to 2021. The Data Tank specializes in learning analytics and pedagogical innovations. Since 2021, Antoine has been working with the ELIXIR team at the French National Research Institute for Health, Environment and Work (Institut de recherche en santé, environnement et travail, IRSET) (Inserm UMR_S 1085), more specifically on CONSTANCES, a large general population-based epidemiological cohort in France.

  • Anas Knefati, Orange Business France

    Anas Knefati holds a PhD in machine learning and worked as a research engineer specializing in machine learning for the Data Tank at the National School for Statistics and Data Analysis (ENSAI) from 2019 to 2021. During his time there, Anas contributed to the Data Tank’s focus on learning analytics and pedagogical innovations. He is currently working at Orange Business, where he is dedicated to industrializing artificial intelligence solutions.


References

Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater® V.2. Journal of Technology, Learning and Assessment, 4(3), 3–29.

Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.

Ballier, N., Canu, S., Petitjean, C., Gasso, G., Balhana, C., …, & Gaillat, T. (2020). Machine learning for learner English. International Journal of Learner Corpus Research, 6(1), 72–103.

Benoit, K., Watanabe, K., Wang, H., Nulty, P., Obeng, A., …, & Matsuo, A. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software, 3(30), 774.

Biber, D., Gray, B., Staples, S., & Egbert, J. (2020). Investigating grammatical complexity in L2 English writing research: Linguistic description versus predictive measurement. Journal of English for Academic Purposes, 46, 100869.

Breiman, L., & Spector, P. (1992). Submodel selection and evaluation in regression. The X-random case. International Statistical Review/Revue Internationale de Statistique, 60(3), 291–319.

Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. Amsterdam: John Benjamins Publishing Company.

Council of Europe (2018). Common European Framework of Reference for Languages: Learning, teaching, assessment—Companion volume. Strasbourg: Council of Europe.

Crossley, S. A., Kyle, K., & Dascalu, M. (2019). The Tool for the Automatic Analysis of Cohesion 2.0: Integrating semantic similarity and text overlap. Behavior Research Methods, 51(1), 14–27.

Dascalu, M., Dessus, P., Trausan-Matu, S., Bianco, M., Nardy, A., …, & Trausan-Matu, S. (2013). ReaderBench, an environment for analyzing text complexity and reading strategies. In H. C. Lane, K. Yacef, J. Mostow, & P. Pavlik (Eds.), Artificial intelligence in education (pp. 379–388). AIED 13. Lecture Notes in Computer Science vol. 7926. Berlin & Heidelberg: Springer.

Dougiamas, M., & Taylor, P. (2003). Moodle: Using learning communities to create an open source course management system. Proceedings of the EDMEDIA 2003 Conference, Honolulu, Hawaii, 171–178.

Eisinga, R., Grotenhuis, M. te, & Pelzer, B. (2013). The reliability of a two-item scale: Pearson, Cronbach, or Spearman-Brown? International Journal of Public Health, 58(4), 637–642.

Ellis, R., Loewen, S., & Erlam, R. (2006). Implicit and explicit corrective feedback and the acquisition of L2 grammar. Studies in Second Language Acquisition, 28(2), 339–368.

Gaillat, T. (in press). Investigating the scope of textual metrics for learner level discrimination and learner analytics. In A. Lenko-Szymanska & S. Götz (Eds.), Complexity, accuracy and fluency in learner corpus research. John Benjamins.

Gaillat, T., Janvier, P., Dumont, B., Lafontaine, A., Knefati, A., …, & Hamon, C. (2019, December). CELVA.Sp: A corpus for the visualisation of linguistic profiles in language learners. PERL 2019, December, Paris.

Gaillat, T., Simpkin, A., Ballier, N., Stearns, B., Sousa, A., …, & Zarrouk, M. (2021). Predicting CEFR levels in learners of English: The use of microsystem criterial features in a machine learning approach. ReCALL, 34(2).

Granger, S. (2015). The contribution of learner corpora to reference and instructional materials design. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 485–510). Cambridge: Cambridge University Press.

Hartigan, J. A., & Wong, M. A. (1979). Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1), 100–108.

Hawkins, J. A., & Buttery, P. (2010). Criterial features in learner corpora: Theory and illustrations. English Profile Journal, 1(1).

Housen, A., Kuiken, F., & Vedder, I. (Eds.). (2012). Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA. Amsterdam: John Benjamins Publishing Company.

Kyle, K. (2016). Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication. Dissertation, Georgia State University.

Kyle, K., Crossley, S., & Berger, C. (2018). The Tool for the Automatic Analysis of Lexical Sophistication (TAALES): Version 2.0. Behavior Research Methods, 50(3), 1030–1046.

Lai, C., & Li, G. (2011). Technology and task-based language teaching: A critical review. CALICO Journal, 28(2), 498–521.

Leacock, C., Chodorow, M., & Tetreault, J. (2015). Automatic grammar- and spell-checking for language learners. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 567–586). Cambridge: Cambridge University Press.

Levshina, N. (2015). How to do linguistics with R: Data exploration and statistical analysis. Amsterdam: John Benjamins Publishing Company.

Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474–496.

Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners’ oral narratives. Modern Language Journal, 96(2), 190–208.

Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In K. Bontcheva & J. Zhu (Eds.), Proceedings of the 52nd annual meeting of the Association for Computational Linguistics: System demonstrations (pp. 55–60). Association for Computational Linguistics.

McNamara, D. S., Boonthum, C., Levinstein, I., & Millis, K. (2007). Evaluating self-explanations in iSTART: Comparing word-based and LSA algorithms. In T. K. Landauer (Ed.), Handbook of latent semantic analysis (pp. 227–241). Mahwah: Lawrence Erlbaum Associates Publishers.

McNamara, D. S., Louwerse, M. M., McCarthy, P. M., & Graesser, A. C. (2010). Coh-Metrix: Capturing linguistic features of cohesion. Discourse Processes, 47(4), 292–330.

Meurers, W. D. (2009). On the automatic analysis of learner language: Introduction to the special issue. CALICO Journal, 26(3), 469–473.

Pilán, I., & Volodina, E. (2018). Investigating the importance of linguistic complexity features across different datasets related to language learning. In L. Becerra-Bonache, M. D. Jiménez-López, C. Martín-Vide, & A. Torrens-Urrutia (Eds.), Proceedings of the Workshop on linguistic complexity and natural language processing (pp. 49–58). Association for Computational Linguistics.

Pilán, I., Volodina, E., & Zesch, T. (2016). Predicting proficiency levels in learner writings by transferring a linguistic complexity model from expert-written coursebooks. In Y. Matsumoto & R. Prasad (Eds.), Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical papers (pp. 2101–2111). COLING 16 Organizing Committee.

Roscoe, R. D., Allen, L. K., Weston, J. L., Crossley, S. A., & McNamara, D. S. (2014). The Writing Pal intelligent tutoring system: Usability testing and development. Computers and Composition, 34, 39–59.

Rudzewitz, B., Ziai, R., Nuxoll, F., Kuthy, K. D., & Meurers, W. D. (2019). Enhancing a web-based language tutoring system with learning analytics. In L. Paquette & C. Romero (Eds.), Joint proceedings of the Workshops of the 12th International Conference on Educational Data Mining co-located with the 12th International Conference on Educational Data Mining, EDM 2019 Workshops (pp. 1–7). CEUR Workshop Proceedings vol. 2592. Aachen: CEUR-WS.

Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189.

Tack, A., François, T., Roekhaut, S., & Fairon, C. (2017). Human and automated CEFR-based grading of short answers. In J. Tetreault, J. Burstein, C. Leacock, & H. Yannakoudakis (Eds.), Proceedings of the 12th Workshop on innovative use of NLP for building educational applications (pp. 169–179). Association for Computational Linguistics.

Vajjala, S. (2018). Automated assessment of non-native learner essays: Investigating the role of linguistic features. International Journal of Artificial Intelligence in Education, 28, 79–105.

Vajjala, S., & Loo, K. (2014). Automatic CEFR level prediction for Estonian learner text. NEALT Proceedings Series, 22, 113–128.

Vajjala, S., & Meurers, D. (2012). On improving the accuracy of readability classification using insights from second language acquisition. In J. Tetreault, J. Burstein, & C. Leacock (Eds.), Proceedings of the 7th Workshop on building educational applications using NLP (pp. 163–173). Association for Computational Linguistics.

Wolfe-Quintero, K., Inagaki, S., & Kim, H.-Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. Second Language Teaching & Curriculum Center, University of Hawaii at Manoa.

Yannakoudakis, H., Briscoe, T., & Medlock, B. (2011). A new dataset and method for automatically grading ESOL texts. In D. Lin, Y. Matsumoto, & R. Mihalcea (Eds.), Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human language technologies (pp. 180–189). Association for Computational Linguistics.

Yannakoudakis, H., Andersen, Ø. E., Geranpayeh, A., Briscoe, T., & Nicholls, D. (2018). Developing an automated writing placement system for ESL learners. Applied Measurement in Education, 31(3), 251–267.






How to Cite

Gaillat, T., Lafontaine, A., & Knefati, A. (2023). Visualizing Linguistic Complexity and Proficiency in Learner English Writings. CALICO Journal, 40(2), 178–197.