From Pen to Pixel

Rethinking English Language Proficiency Admissions Assessments in the Digital Era




Keywords: technology-mediated language use, postsecondary international students, digital technologies in higher education, generative AI tools, English language proficiency tests


The use of digital technologies in higher education is continually increasing, leading to changes in language use and presumably altering the language skills needed for academic studies. However, scores from high-stakes English language proficiency (ELP) tests used in postsecondary admissions only ensure the prerequisite level of traditional English skills (reading, writing, listening, and speaking). Such tests generally do not directly assess technology-mediated language skills (e.g., using online dictionaries and communicating via text message) that likely facilitate successful degree completion for international students. We present results from a needs analysis survey to re-evaluate the English-medium postsecondary linguistic landscape (i.e., update the target language use domain description), to inform ELP admissions tests. We specifically investigate international student (n = 379) and disciplinary instructor (n = 427) perceptions of the importance and frequency of technology-mediated language skills. Results show that student and instructor responses differ on certain technology-mediated skills, such as typing on a smartphone, underscoring the need to consider diverse perspectives in domain analysis research. Findings may inform how digital ELP admissions tests are developed, and how English for academic purposes curricula are designed, in order to better align test/classroom tasks with the academic language skills postsecondary students need.

Author Biographies

  • Ramsey Cardwell, Duolingo

    Ramsey Cardwell is a Senior Assessment Scientist at Duolingo, where he conducts validation research for the Duolingo English Test. He previously worked in K12 and licensure testing and taught English as a foreign language. He holds a PhD in Educational Measurement from the University of North Carolina at Greensboro and an MSc in Quantitative Psychology from McGill University.

  • Ben Naismith, Duolingo

    Ben Naismith is a Senior Assessment Scientist at Duolingo where he works on research and development of the Duolingo English Test. Ben has worked in numerous contexts as a teacher, teacher trainer, materials developer, assessment specialist, and researcher. He holds a PhD in Applied Linguistics from the University of Pittsburgh.

  • Jill Burstein, Duolingo

    Jill Burstein is a Principal Assessment Scientist at Duolingo where she conducts assessment innovation research for the Duolingo English Test. As a leader in AI in Education, Dr. Burstein has led NLP teams that invented automated writing evaluation systems used in high-stakes, large-scale assessment, and online writing support applications. She has published widely and holds many patents for this work. More recently, she wrote responsible AI standards for the Duolingo English Test, the first such standards for an assessment program. She is a co-founder of SIG EDU, an ACL Special Interest Group on Building Educational Applications. Dr. Burstein holds a Ph.D. in Linguistics from the Graduate Center, City University of New York.

  • Steven Nydick, Duolingo

    Steven Nydick is a Lead Psychometrician at Duolingo where he contributes to the scoring and administration logic of the Duolingo English Test. Steven has over 10 years of experience in psychometrics, assessment development, statistics, data science, and programming. He has a PhD in Quantitative Psychology/Psychometric Methods and an MS in Statistics, both from the University of Minnesota.

  • Sarah Goodwin, Duolingo

    Sarah Goodwin, Ph.D. (Applied Linguistics, Georgia State University), Senior Assessment Scientist at Duolingo, specializes in language testing, item development, and measurement. She has taught English and linguistics courses and developed tests for diverse populations including multilingual learners and people with disabilities. Sarah’s research focuses on academic English proficiency, listening comprehension, and technology in language assessment. Her work also appears in Assessing Writing, Language Assessment Quarterly, and Frontiers in Artificial Intelligence.

  • Anthony Verardi, Duolingo

    Anthony Verardi is a Research Program Manager for the Duolingo English Test and holds an MA in Applied Linguistics from the University of Pittsburgh. Anthony's research interests include second language acquisition and pedagogy, sociolinguistics (particularly raciolinguistics and the effects of gender and sexuality on language), and computational linguistics.









How to Cite

Cardwell, R., Naismith, B., Burstein, J., Nydick, S., Goodwin, S., & Verardi, A. (2024). From Pen to Pixel: Rethinking English Language Proficiency Admissions Assessments in the Digital Era. CALICO Journal, 41(2), 209–234.