Exploring the Use of Machine Learning to Automate the Qualitative Coding of Church-related Tweets


  • Anthony-Paul Cooper, Durham University
  • Emmanuel Awuni Kolog, University of Ghana Business School
  • Erkki Sutinen, University of Turku




Keywords: digital theology, machine learning, sociology of religion, social media research


This article builds on previous research into the content of church-related tweets. It examines whether the qualitative thematic coding of such tweets can, in part, be automated using machine learning. It compares three supervised machine learning algorithms on a classification task, using a dataset of human-coded church-related tweets, to assess how well each performs. The study finds that one algorithm, Naïve Bayes, performs better than the other two, returning Precision, Recall and F-measure values that each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data (in this case, Twitter data) makes manual coding so resource-intensive that it can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology seeking to better understand the content of online church discourse.
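As an illustration of the three evaluation measures named in the abstract, the following minimal sketch computes Precision, Recall and F-measure for a single thematic code and checks each against the 70% threshold. It is not the authors' implementation. The function name, the thematic codes ("worship", "event", "other") and the eight-tweet toy dataset are all hypothetical and chosen only for demonstration.

```python
def precision_recall_f(gold, predicted, label):
    """Return (precision, recall, F-measure) for one thematic code."""
    pairs = list(zip(gold, predicted))
    # True positives: the classifier and the human coder both chose `label`.
    tp = sum(1 for g, p in pairs if g == label and p == label)
    # False positives: the classifier chose `label`, the human coder did not.
    fp = sum(1 for g, p in pairs if g != label and p == label)
    # False negatives: the human coder chose `label`, the classifier did not.
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


# Hypothetical toy data: human codes vs. classifier output for eight tweets.
gold = ["worship", "event", "worship", "other",
        "event", "worship", "other", "worship"]
predicted = ["worship", "event", "worship", "worship",
             "event", "other", "other", "worship"]

THRESHOLD = 0.70  # the acceptability threshold stated in the abstract
p, r, f = precision_recall_f(gold, predicted, "worship")
meets_threshold = all(value > THRESHOLD for value in (p, r, f))
print(f"precision={p:.2f} recall={r:.2f} f_measure={f:.2f} ok={meets_threshold}")
```

In a multi-code setting these per-code values would normally be averaged across codes. The abstract does not say which averaging scheme the study used, so none is assumed here.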



Author Biographies

Anthony-Paul Cooper, Durham University

Anthony-Paul Cooper is co-director of the Centre for Church Growth Research at Cranmer Hall, Durham University. Anthony-Paul has a background in social research, with previous research topics including new church use of “secular” and “sacred” space and the use of social media data to better understand church attendance and church growth.

Emmanuel Awuni Kolog, University of Ghana Business School

Emmanuel Awuni Kolog is a faculty member at the Department of Operations and Management Information Systems of the University of Ghana Business School. Emmanuel’s research interests are multidisciplinary, spanning text mining, affect detection, learner analytics, machine learning applications and business intelligence.

Erkki Sutinen, University of Turku

Erkki Sutinen is Professor of Computer Science at the University of Turku and an ordained priest. Erkki’s research interests include educational technology, computing education, ICT4D, co-design and digital theology. He has supervised circa 30 PhDs and co-authored around 300 papers. Erkki is currently based in Windhoek, Namibia, having recently set up the first overseas campus of the University of Turku.





How to Cite

Cooper, A.-P., Kolog, E. A., & Sutinen, E. (2020). Exploring the Use of Machine Learning to Automate the Qualitative Coding of Church-related Tweets. Fieldwork in Religion, 14(2), 140–159. https://doi.org/10.1558/firn.40610