Five statistical models for Likert-type experimental data on acceptability judgments

Authors

  • Laura A. Janda, UiT The Arctic University of Norway; The National Research University Higher School of Economics in Moscow
  • Anna Endresen, UiT The Arctic University of Norway

DOI:

https://doi.org/10.1558/jrds.30822

Keywords:

Likert scale, acceptability judgement, experiment, marginal verb, prefix, Russian

Abstract

This paper contributes to the ongoing debate over Likert scale experiments, in particular the issues of how to treat acceptability judgment data (as ordinal or interval) and which statistical model is appropriate to apply. We analyze empirical data on native speakers’ intuitions regarding marginal change-of-state verbs in Russian (e.g. ukonkretit’ ‘concretize’, ovnešnit’ ‘externalize’) and compare the outcomes of five statistical models (parametric and non-parametric tests): (1) ANOVA; (2) Ordinal Logistic Regression Model; (3) Mixed-Effects Regression Model for Ordinal data; (4) Regression Tree and Random Forests Model; and (5) Classification Tree and Random Forests Model. We make four claims: (1) all five models are appropriate for this data to a greater or lesser degree; (2) overall, parametric and non-parametric tests applied to this data yield comparable results; (3) the Classification Tree and Random Forests Model is the most appropriate, informative, and user-friendly for this data; and (4) the use of a culturally entrenched grading scale is an advantage.
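
The contrast between treating ratings as interval data and treating them as ordinal data can be illustrated with a minimal sketch. It is not the authors' analysis: the ratings, the group names, and the helper functions anova_f and kruskal_h are hypothetical, and the Kruskal-Wallis test is used here only as a simple rank-based stand-in for the ordinal models named in the abstract. The one-way ANOVA F statistic uses the numeric distances between scale points, while the Kruskal-Wallis H statistic uses only their rank order.

```python
from collections import Counter
from statistics import mean

# Hypothetical 1-5 ratings for three stimulus groups (invented for illustration,
# not data from the paper).
ratings = {
    "group_a": [5, 5, 4, 5, 4, 5],
    "group_b": [3, 4, 2, 3, 3, 4],
    "group_c": [1, 2, 1, 1, 2, 1],
}


def anova_f(groups):
    """One-way ANOVA F statistic: treats the ratings as interval data."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def kruskal_h(groups):
    """Kruskal-Wallis H statistic with tie correction: uses rank order only."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    counts = Counter(all_values)
    # Tied values share the average of the ranks they occupy.
    rank_of = {}
    next_rank = 1
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = next_rank + (c - 1) / 2
        next_rank += c
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / tie_correction


groups = list(ratings.values())
f_stat = anova_f(groups)
h_stat = kruskal_h(groups)
print(f"ANOVA F = {f_stat:.2f}, Kruskal-Wallis H = {h_stat:.2f}")
```

With clearly separated groups like these, both statistics come out large, which is the kind of agreement between parametric and non-parametric tests that the second claim describes. Turning either statistic into a p-value requires the F or chi-squared distribution, which this sketch leaves out.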

Author Biography

Laura A. Janda, UiT The Arctic University of Norway; The National Research University Higher School of Economics in Moscow

Professor, Department of Language and Culture

Published

2017-10-30

How to Cite

Janda, L. A., & Endresen, A. (2017). Five statistical models for Likert-type experimental data on acceptability judgments. Journal of Research Design and Statistics in Linguistics and Communication Science, 3(2), 217-250. https://doi.org/10.1558/jrds.30822

Section

Articles