The term ‘collocation’ was first introduced by Firth (1957) to describe a combination of words associated with each other, the idea being that the meaning and function of a word can be determined by its habitual occurrence with other words. This theory, known as the ‘contextual theory of meaning’, claims that the meaning of a word such as dark is determined in part by the neighbouring word night in the phrase dark night.

The term ‘collocation’ has its origin in the Latin verb ‘collocare’ which means ‘to set in order/to arrange’.

Although collocation has been defined differently by quite a large number of scholars, many agree that collocation is “the occurrence of two or more words within a short space of each other in a text” (Sinclair, 1991) or the co-occurrence of two or more lexical items as realizations of structural elements within a syntactic pattern (Cowie, 1978). Bahns and Eldaw (1993) add that the major characteristics of collocations are that their meanings reflect the meanings of their constituent parts and that they are used frequently, spring to mind readily, and are psychologically salient. Collocations range along a continuum from very fixed expressions (idioms, particles, and complex collocations of prepositions) to less restricted collocations, which allow limited combinability with other words.

There are several approaches to studying collocation: the lexical, semantic, and structural approaches, as follows:

  1. The Lexical Approach

Firth is widely regarded as the father of collocation and the developer of the lexical approach, the most traditional approach to this phenomenon. Supporters of the lexical approach claim that the meaning of a word is determined by the co-occurring words. Thus, part of the meaning of a word is the fact that it collocates with another word. However, those combinations are often strictly limited, e.g. make an omelette but do your homework.

One of Firth’s revolutionary concepts was to perceive lexical relations as syntagmatic rather than paradigmatic. Halliday (1966) and Sinclair (1991) are among Firth’s followers.

For Halliday, collocations are examples of word combinations; he maintains that collocation cuts across grammar boundaries. For instance, he argued strongly and the strength of his argument are grammatical transformations of the initial collocation strong argument. In his works he highlights the crucial role of collocations in the study of lexis.

Sinclair introduces the terminology: an item whose collocations are studied is called a ‘node’; the number of relevant lexical items on each side of a node is defined as a ‘span’; and the items found within the span are called ‘collocates’. Later, Sinclair modifies his position, forming an ‘integrated approach’ and dismissing the earlier idea that lexis is rigidly separated from grammar. In this new approach both the lexical and the grammatical aspects of collocation are taken into consideration. As a result, Sinclair (1991) divides collocations into two categories: ‘upward’ and ‘downward’ collocations. The first group consists of words which habitually collocate with words that are more frequent in English than they are themselves, e.g. back collocates with at, down, from, into and on, all of which are more frequent than back. Conversely, ‘downward’ collocations are words which habitually collocate with words that are less frequent than they are, e.g. arrive and bring are collocates of back that occur less frequently than back itself. Sinclair makes a sharp distinction between the two categories, claiming that the elements of ‘upward’ collocation (mostly prepositions, adverbs, conjunctions and pronouns) tend to form grammatical frames, while the elements of ‘downward’ collocation (mostly nouns and verbs) by contrast give a semantic analysis of a word.
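The node, span and collocate terminology, together with the upward/downward split, can be sketched in a few lines of Python. This is only an illustrative sketch, not Sinclair's own procedure: the function names, the toy sentence and the choice of a span of 1 are assumptions made here, and word frequency is simply counted in the toy text rather than taken from a real corpus.

```python
from collections import Counter

def collocates(tokens, node, span=4):
    """Count every token found within `span` positions of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)
        # Words to the left and to the right of the node, excluding the node itself.
        window = tokens[lo:i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts

def split_upward_downward(tokens, node, span=4):
    """Separate collocates that are more frequent than the node (upward)
    from those that are less frequent (downward)."""
    freq = Counter(tokens)
    upward, downward = {}, {}
    for word, n in collocates(tokens, node, span).items():
        if freq[word] > freq[node]:
            upward[word] = n
        elif freq[word] < freq[node]:
            downward[word] = n
    return upward, downward

# Toy text, invented for illustration only (hypothetical data).
text = ("she came back from town and he went back into the house "
        "and the dog ran from the house into the road "
        "and from there into the field")
tokens = text.split()

up, down = split_upward_downward(tokens, "back", span=1)
print(up)    # {'from': 1, 'into': 1}
print(down)  # {'came': 1, 'went': 1}
```

In this toy text the prepositions from and into are more frequent than the node back and so count as upward collocates, while the verbs came and went are less frequent and count as downward collocates, which mirrors the grammatical-frame versus semantic split described above.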

  2. The Semantic Approach

This approach goes beyond the sheer observation of collocations and tries to determine their specific shape. Its supporters attempt to examine collocations from the semantic point of view, also separately from grammar. Their main goal is to find out why words collocate with certain other words, e.g. why we can say blonde hair but not blonde car. This question still represents a challenge for linguists today.

  3. The Structural Approach

According to this approach, collocation is determined by structure and occurs in patterns. Therefore, the study of collocation should include grammar (Gitsaki, 1996), which contrasts with the two approaches described above, the lexical and the semantic. Lexis and grammar cannot be separated and, consequently, two categories are defined: lexical and grammatical collocation, which represent two distinct but related aspects of one phenomenon. Grammatical collocations usually consist of a noun, an adjective or a verb plus a preposition or a grammatical structure such as ‘to+infinitive’ or ‘that-clause’, e.g. by accident, to be afraid that. Lexical collocations do not contain grammatical elements, but are combinations of nouns, adjectives, verbs and adverbs (Bahns, 1993). Benson, Benson and Ilson (1997) define collocations as specified, identifiable, non-idiomatic, recurrent combinations. In their dictionary they divide them into two groups: grammatical and lexical collocations. The first category consists of the main word (a noun, an adjective or a verb) plus a preposition, ‘to+infinitive’ or ‘that-clause’, and is characterized by 5 basic types of collocations.

Lexical collocations do not contain prepositions, infinitives or clauses but consist of nouns, adjectives, verbs and adverbs. There are 6 types of them:

Type: Examples

Grammatical Collocations:

· Verb + Preposition: (to) get at, (to) go for
· Adjective + Preposition: different from, curious about, full of
· Adjective + Preposition + Preposition: fed up with
· Preposition + Noun: for sale, on time
· Dative movement transformation: She sent the book to him/She sent him the book

Lexical Collocations:

· verb + noun (pronoun, prepositional phrase): (to) reach a verdict, (to) launch a missile, (to) lift a blockade, (to) revoke a license
· adjective + noun: reckless abandon, sweeping generalization
· noun + verb: adjectives modify, alarms go off
· noun + of + noun: a bunch of flowers, a piece of advice
· adverb + adjective: deeply religious, fiercely independent
· verb + adverb: (to) apologize humbly, (to) affect deeply

Kjellmer (1990) tries to establish to what extent individual word classes are ‘collocational’ or ‘non-collocational’ in character. His results show that articles, prepositions, singular and mass nouns, and the base forms of verbs are collocational in nature, whereas adjectives, singular proper nouns and adverbs are not. Kjellmer claims that English words are scattered across a continuum which extends from items whose contextual company is entirely predictable to those whose contextual company is entirely unpredictable. According to his results, most words tend to appear towards the beginning of the continuum, which can also be described as a scale of fixedness of collocation, extending from totally free, unrestricted combinations to totally fixed and invariable ones. Kjellmer’s notion of a collocational continuum is also relevant to lexical collocations, although they are linked together differently from grammatical ones, in that they depend more on semantics.

Lewis (2000) argues that most collocations are found in the middle of this continuum, which means that there are very few ‘strong’ collocations. He distinguishes between ‘strong’ collocations, e.g. avid reader, budding author; ‘common’ collocations, which make up numerous word combinations, e.g. fast car, have dinner, a bit tired; and ‘medium strong’ collocations, which in his view account for the largest part of the lexis a language learner needs, e.g. magnificent house, significantly different. Hill adds one more category, ‘unique’ collocations, such as to foot the bill and shrug one’s shoulders. It is worth noting that the strength of a collocation is not reciprocal: the bond between the words is not equally strong in both directions, e.g. blonde and hair. Blonde collocates only with a limited number of words describing hair colour, whereas hair collocates with many words, e.g. brown, long, short, mousy. Very often the bond between the words is unilateral, e.g. in the phrase vested interest, vested only ever collocates with interest but interest collocates with many other words.
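The non-reciprocal nature of collocational strength can be made concrete with a small calculation. The sketch below is an illustration only: it uses a simple conditional proportion (the share of a word's occurrences that fall inside the pair), which is a measure chosen here for clarity and not one proposed by Lewis or Hill, and the counts are hypothetical figures invented for the example.

```python
def directional_strength(pair_count, word_count):
    """Share of a word's occurrences that fall inside the pair,
    i.e. how strongly that word predicts its partner."""
    return pair_count / word_count

# Hypothetical counts, invented purely for illustration.
counts = {"blonde": 50, "hair": 1000, ("blonde", "hair"): 40}

pair = counts[("blonde", "hair")]
blonde_to_hair = directional_strength(pair, counts["blonde"])  # 0.8
hair_to_blonde = directional_strength(pair, counts["hair"])    # 0.04

print(f"blonde -> hair: {blonde_to_hair:.2f}")
print(f"hair -> blonde: {hair_to_blonde:.2f}")
```

With these invented figures, blonde predicts hair far more strongly than hair predicts blonde, which is the asymmetry described for blonde hair and vested interest.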

Hunston (1997) concluded that there are correlations between grammatical patterns and lexical meaning. All words can be represented by specific patterns, and the meanings of words which share patterns have a lot in common. That means that a word has a specific meaning when it co-occurs with a certain word. This hypothesis is followed by Hoey (2000), who maintains that some meanings of the same word have their own grammatical patterns, a phenomenon called ‘colligation’. This concept, introduced by Firth, is concerned with the relationship between grammatical classes, whereas collocation is concerned with the words which belong to these grammatical classes. The grammatical pattern [verb+to-infinitive] is an example of colligation, and [dread+think] is an example of a collocation that realizes this colligation. In short, colligation defines the grammatical company and interaction of words as well as their preferred position in a sentence.

Another key point in the study of collocation introduced by Firth is the notion of the syntagmatic (horizontal) as opposed to the paradigmatic (vertical) relationship between its elements. In the syntagmatic dimension we can clearly see the relationship between linearly ordered words, which make up an individual syntactic unit, here a collocation. In the sentence It writhed on the floor in agonizing pain, the syntagmatic relationship is the one between the words writhed, floor, agonizing and pain, whereas the paradigmatic relationship is between a word and a group of words which can replace it in this sentence:

It   writhed   on the   floor           in   agonizing    pain.
                        bed                  burning
                        pavement             stabbing
                        (paradigm 1)         (paradigm 2)

Lewis (1994) defines collocation as a subcategory of multi-word items, made up of individual words which habitually co-occur and can be found within the free-fixed collocational continuum. In his opinion, they differ from another important subcategory of multi-word items called institutionalized expressions because collocations tell more about the content of what a language user expresses than about what the language user is doing, e.g. apologizing or denying. Lewis (1997) points out that collocation is not determined by logic or frequency but is arbitrary, decided only by linguistic convention. Dzierżanowska (1988) adds that the words that make up a collocation do not combine with each other at random. Collocations cannot be invented by a second language user, whereas a native speaker uses them instinctively. In every language collocations comply with the rules characteristic of that language and therefore they cause serious problems both for learners and translators, e.g. menggapai tujuan has two English equivalents, achieve/reach an aim, but _____ can be translated only with the verb reach, not achieve: reach an agreement. Consequently, collocations must be memorized or looked up in an adequate dictionary.

Celce-Murcia (1991) defines collocation as a co-occurrence of lexical items in combinations, which can differ in frequency or acceptability. Items which collocate frequently with each other are called ‘habitual’, e.g. tell a story, whereas those which cannot co-occur are called ‘unacceptable’, e.g. *powerful tea instead of strong tea.

Similarly, in Carter’s view (1987), collocation is a group of words that recurrently co-occur in a language. He agrees with Benson that there are grammatical collocations, which result from the grammatical relationship between the words, and lexical collocations, which result not only from the grammatical relationship but most of all from the co-occurrence of lexical units in a specific company. The total number of words which can collocate with a word X is called the ‘cluster’ of X. He also points out that certain elements of a cluster are more central than others, which means that they are more likely to co-occur with X. Carter divides collocations into four categories, depending on how restricted they are: ‘unrestricted’ collocations, which collocate freely with a number of lexical items, e.g. take a look/a holiday/a rest/a letter/time/notice/a walk; ‘semi-restricted’ collocations, in which the number of adequate substitutes which can replace the elements of the collocation is more limited, e.g. harbor doubt/grudges/uncertainty/suspicion; ‘familiar’ collocations, whose elements collocate on a regular basis, e.g. unrequited love, lukewarm reception; and ‘restricted’ collocations, which are fixed and inflexible, e.g. dead drunk, pretty sure. Carter distinguishes between ‘core’ and ‘non-core’ words, claiming that the more core a lexical item is, the more frequently it collocates. Core words are more central in a language than non-core words, which is why non-core words can be defined or replaced by core items, e.g. eat is a core word for gobble, dine, devour, stuff and gormandize because its meaning is the basic meaning of every item in the group; this relationship is not reciprocal. In Carter’s view, words are scattered across a core to non-core continuum and their position on this scale determines their collocability. The nearer to the core end of the continuum a word is, the more frequently it collocates, e.g. bright > radiant > gaudy:

bright: sun/light/sky/idea/colour/red/future/prospects/child

radiant: sun/light/smile

gaudy: colour
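The three clusters above can be written down as data and ranked, as a minimal sketch. The collocate lists are taken directly from the example; using cluster size as a stand-in for a word's position on the core to non-core continuum is a simplifying assumption made here, not Carter's full set of criteria, and the variable names are hypothetical.

```python
# Clusters copied from the bright / radiant / gaudy example above.
clusters = {
    "bright": ["sun", "light", "sky", "idea", "colour", "red",
               "future", "prospects", "child"],
    "radiant": ["sun", "light", "smile"],
    "gaudy": ["colour"],
}

# Assumption: a larger cluster is read as a more 'core' word.
ranked = sorted(clusters, key=lambda word: len(clusters[word]), reverse=True)
print(" > ".join(ranked))  # bright > radiant > gaudy

# Collocates that a less core word shares with the most core one.
shared = {
    word: sorted(set(clusters[word]) & set(clusters["bright"]))
    for word in ("radiant", "gaudy")
}
print(shared)  # {'radiant': ['light', 'sun'], 'gaudy': ['colour']}
```

The shared collocates show that radiant and gaudy each cover only part of the range of bright, which is consistent with the ordering bright > radiant > gaudy.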

According to a dictionary definition (Szulc, 1984), collocation is the ability of lexical items to build steady, conventionalized syntagmatic relationships with other words, e.g. putrid, rotten, rancid and addled are synonyms which designate rotten food but they collocate only with a limited number of words: putrid fish, rancid butter/oil, addled eggs, rotten fruit. Individual collocations are determined by the lexical system of a language and can result from historical changes.

According to the Oxford Collocations Dictionary (2002), collocation is a means of combining words in a language to produce natural-sounding speech and writing. Incorrect combinations such as heavy wind or strong rain do not sound natural in English. Apart from the prevalent grammatical/lexical distinction, the authors also mention ‘word’ collocation, none of whose elements can be replaced even with a synonym, e.g. small fortune but not *little fortune, and ‘category’ collocation, whose elements can collocate with any item from a precisely determined group of words. This group can be quite large and its elements are predictable because they belong to the same category, e.g. measurements of time for the noun walk: five minutes’ walk/three-minute walk.

Why are collocations important? Collocations have been claimed to be dominant in academic texts, especially in the texts of specialised disciplines (e.g. law, medicine, biology), where they become the basic building blocks of specialised language and constitute the expressions of knowledge, concepts, and ideas in these discourses (Halliday, 1992). They also perform specific functions and organise thoughts in those texts (Fuentes, 2001). Students who are competent in collocation (who have collocational competence) are regarded as having attained an advanced or higher level of English fluency or communicative competence (Hill, 2000). Collocational knowledge becomes a determining factor for students’ success in their academic and professional careers (Howarth, 1998). In addition, learning vocabulary in chunks may expedite the second language acquisition process. Since our short-term memory (STM) can hold only a few words at a time, storing meaningful phrases rather than discrete single words may ease the retrieval of those phrases from our mental lexicon. In this way, it resembles the acquisition of one’s first language (Wray, 2002).


