Authorship attribution, idiolectal style, and online identity

A specialised corpus of Najdi Arabic Tweets

Authors

  • Mashael AlAmr, King Saud University

DOI:

https://doi.org/10.1558/ijsll.27343

Keywords:

authorship analysis, forensic linguistics, corpus linguistics, online identity, computer-mediated discourse analysis (CMDA), machine learning

Abstract

Assistant Professor
Department of English Language
College of Language Sciences
King Saud University
Riyadh 11495, Kingdom of Saudi Arabia

Awarding Institution: University of Leeds, UK
Date of Award: 14 October 2022

Author Biography

  • Mashael AlAmr, King Saud University

    Mashael AlAmr is an assistant professor in the Department of English Language, College of Language Sciences, King Saud University.

Published

2024-03-06

Section

Thesis Abstracts

How to Cite

AlAmr, M. (2024). Authorship attribution, idiolectal style, and online identity: A specialised corpus of Najdi Arabic Tweets. International Journal of Speech, Language and the Law. https://doi.org/10.1558/ijsll.27343