Measuring semantic distance across time

An analysis of the collocational profiles of a set of near-synonyms in American English


Daniela Pettersson-Traba, Universidad de Extremadura



Keywords: near-synonymy, collocation, semantic vector spaces, collocational networks, diachrony


Over recent decades, several studies have analyzed the collocational preferences of particular sets of near-synonyms from a synchronic viewpoint, whereas their diachronic development has largely been disregarded. This paper aims to partially fill this gap by examining the collocational behavior of the adjectives fragrant, perfumed, and scented, which denote the concept SWEET-SMELLING, over the period 1810–2009. To this end, instances of the three near-synonyms and their L5–R5 collocates (the five words to the left and to the right of each instance) were extracted from the Corpus of Historical American English (COHA) and then submitted to statistical modeling. Results indicate that, at the beginning of the period analyzed, the collocational preferences of scented and perfumed are very similar but that, over time, scented becomes semantically closer to fragrant while taking over some of its functions.
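
As an illustration only, the extraction of L5–R5 collocates and the comparison of collocational profiles can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names `window_collocates` and `cosine_similarity`, the toy token list, and the choice of cosine similarity over raw counts are all hypothetical. The abstract says only that the data were "submitted to statistical modeling".

```python
from collections import Counter
from math import sqrt


def window_collocates(tokens, node, span=5):
    """Count tokens within `span` positions left or right of each `node` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]    # up to `span` tokens to the left
            right = tokens[i + 1:i + 1 + span]   # up to `span` tokens to the right
            counts.update(left + right)
    return counts


def cosine_similarity(a, b):
    """Cosine similarity of two collocate-frequency profiles (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data standing in for one time slice of a corpus (not from COHA).
toy_tokens = (
    "the scented soap and the perfumed soap lay near "
    "fragrant flowers and scented candles"
).split()

# One collocate profile per adjective (assumed analysis unit for this sketch).
profiles = {
    adj: window_collocates(toy_tokens, adj)
    for adj in ("fragrant", "perfumed", "scented")
}

for first, second in (("scented", "perfumed"), ("scented", "fragrant")):
    score = cosine_similarity(profiles[first], profiles[second])
    print(f"{first} ~ {second}: {score:.3f}")
```

Repeating such a comparison on separate time slices of a corpus would give one similarity value per period, which is one simple way to trace whether two near-synonyms converge or diverge over time. Studies of this kind typically use association measures rather than raw counts; raw counts are used here only to keep the sketch short.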

Author Biography

Daniela Pettersson-Traba, Universidad de Extremadura

Daniela Pettersson-Traba holds a BA in English Language and Literature and an MA in Advanced English Studies. From 2016 to 2019 she was a full-time postgraduate researcher at the Department of English and German of the University of Santiago de Compostela, funded by the Regional Government of Galicia (ref. ED481A-2016/168). She currently works at the University of Extremadura. Her research interests include diachronic variation in English, corpus linguistics, and semantics.





How to Cite

Pettersson-Traba, D. (2021). Measuring semantic distance across time: An analysis of the collocational profiles of a set of near-synonyms in American English. Journal of Research Design and Statistics in Linguistics and Communication Science, 6(2), 138–165.