Automated Writing Analysis for Writing Pedagogy
From Healthy Tension to Tangible Prospects
Keywords: AES, AWE, writing pedagogy, assessment, genre, writing praxis
This article aims to engage specialists in writing pedagogy, assessment, genre study, and educational technologies in a constructive dialog and joint exploration of automated writing analysis as a potent instantiation of computer-enhanced assessment for learning. It recounts the values of writing pedagogy and, from this perspective, examines legitimate concerns with automated writing analysis. Emphasis is placed on the need to substantiate the construct-driven debate with systematic empirical evidence that would corroborate or refute interpretations, uses, and consequences of automated scoring and feedback tools intended for specific contexts. Such evidence can be obtained by adopting a validity argument framework. To demonstrate an application of this framework, the article presents a novel genre-based approach to automated analysis configured to support research writing and provides examples of validity evidence for using it with novice scholarly writers.
© Equinox Publishing Ltd.