Detection Systems for Text-Based Plagiarism

Developments, Principles, Challenges, and the Aftermath


  • Wilfried Decoo, Brigham Young University
  • Jozef Colpaert, University of Antwerp



Keywords: plagiarism, plagiarism detection, SVUM scale


The growing attention given to plagiarism in the Internet Age has triggered the development and marketing of scores of antiplagiarism services and devices. This article deals only with text-based plagiarism. We first mention some of the current developments in plagiarism detection systems. Next, we briefly describe the principles and challenges of such systems. Finally, we outline what can happen after a system reports possible plagiarism.

Author Biographies

Wilfried Decoo, Brigham Young University

Wilfried Decoo, who holds a PhD from Brigham Young University and a PhD from the Belgian Interuniversity Commission, is Professor of French and of applied linguistics at Brigham Young University (USA) and at the University of Antwerp (Belgium). He is the author of language learning textbooks and articles on language pedagogy, including Crisis on Campus: Confronting Academic Misconduct (MIT Press, 2002) and Systemization in Foreign Language Teaching: Monitoring Content Progression (Routledge, 2010).

Jozef Colpaert, University of Antwerp

Jozef Colpaert, who holds a PhD from the University of Antwerp, is Associate Professor of e-learning and educational engineering at the University of Antwerp and Director of Research and Development at Linguapolis, the Institute for Language and Communication. He is editor-in-chief of the journal Computer Assisted Language Learning, and his publications focus on educational engineering as research and goal-oriented design of learning environments.





How to Cite

Decoo, W., & Colpaert, J. (2010). Detection Systems for Text-Based Plagiarism: Developments, Principles, Challenges, and the Aftermath. Writing & Pedagogy, 2(2), 311-320.



From the e-Sphere