30 December 2015

Wheels Are Made for Rollin’

Reduplicated nouns certainly existed in Proto-Indo-European, but they are a poorly investigated species. I will leave aside onomatopoeic reduplication, when the echo consists of at least a CVC sequence, as in Proto-Slavic *golgolъ ‘speech’, Greek bárbaros ‘foreign’ (that is, speaking incomprehensibly), Latin murmur (no gloss necessary), or when the whole stem is repeated, as in Hittite harsiharsi- ‘thunderstorm’. There is a more interesting type in which reduplication is “grammatical” rather than purely iconic, the echo template is CV, and only the consonant is copied from the base. The showcase specimen is the celebrated word for ‘wheel’, *kʷékʷlos. It is not attested in the Anatolian subfamily, so its Proto-Indo-European status is uncertain, but it dates back at least to the common ancestor of Core Indo-European.¹

A Bronze Age sun chariot
[source: the National Museum of Denmark, Copenhagen]

The ‘wheel’ word is interesting for several reasons. Not all of them need to concern us here. Wheeled transport (in combination with horse domestication) is supposed to have played a crucial role in the early migrations of the Indo-European-speakers, and consequently in the expansion of the Indo-European languages. The appearance of a “technological package” containing terms for ‘wheel’, ‘axle’, ‘cart/wagon’, etc. marks the onset of these historical processes. But I shall concentrate on the linguistic properties of the word, not its cultural importance. The latter is relevant only as an “ecological” factor favouring the frequent use of the word, its successful survival and rich attestation.

*kʷékʷlos is an original masculine – or, if it dates back to Proto-Indo-European after all, an animate, non-neuter noun. One of its forms is conspicuous by its unusually high survival rate – the collective *kʷəkʷláh₂ (see below for details of the reconstruction). It must have been used very frequently, for it tends to occur instead of the expected masculine plural. In Homeric Greek, for example, kúklos has an irregular plural, kúkla (as if the word were neuter rather than masculine). This is quite striking, because the use of the old PIE collective with animate nouns, still productive in Old Hittite, became extremely rare in Core IE. The collective, co-opted already in PIE as the ordinary nominative/accusative plural of the neuter gender, came to be associated exclusively with neuters in most daughter languages. Wheels, however, are more often spoken of  as fixed sets (the two wheels of a chariot, the four wheels of a wagon) than as an arbitrary number of individual objects. The fact that *kʷəkʷláh₂ is preserved so well shows that the word was applied to wheels as vehicle parts when the collective was still a living grammatical category, contrasting with the count plural.³

Let’s take *kʷékʷlos apart into its morphological constituents: *kʷe-kʷl-o-s. The core part is *-kʷl-, in which we can recognise the very common verb root *kʷelh₁- ‘move round, follow one’s course’ (with a variety of secondary meanings, such as ‘become, stay around, inhabit, observe, cultivate, take care of’ and the like). The phonetic reduction of the root, resulting in the loss of the laryngeal segment *h₁, is a normal phenomenon in compounds and reduplications. The reduplicated noun is thematic (has a stem ending in the vocalic suffix *-o-), which suggests adjectival origin. Collectives of o-stems were formed by adding the *-h₂ suffix to the stem-final vowel in the e-grade: *-e-h₂ → *-ah₂. If the singular had initial accent, the collective was accented on the ending (*-áh₂). This accent shift happened early enough to affect the vocalism of some nouns (from a sufficiently old lexical stock). It is therefore probable that the collective was *kʷəkʷláh₂, with a weak prop-vowel rather than a full-grade *e in the first syllable. This would explain the development of the word in Greek: *kʷəkʷ- > *kukʷ- (with the prop-vowel “stealing” lip-rounding from the preceding labiovelar) > kuk- (with a regular delabialisation of *kʷ after /u/).
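The two Greek steps can be pictured as ordered rewrite rules applied to a list of segments. What follows is only a minimal sketch: the segmentation, the function names, and the restriction to the bare stem *kʷəkʷl- (ignoring the accent and the ending) are my simplifying assumptions, not a full account of Greek historical phonology.

```python
LABIOVELAR = "kʷ"

def round_prop_vowel(segments):
    """Step 1: kʷ + ə -> k + u (the prop-vowel takes over the lip-rounding)."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] == LABIOVELAR and out[i + 1] == "ə":
            out[i], out[i + 1] = "k", "u"
    return out

def delabialise_after_u(segments):
    """Step 2: kʷ -> k when it immediately follows u."""
    out = list(segments)
    for i in range(1, len(out)):
        if out[i] == LABIOVELAR and out[i - 1] == "u":
            out[i] = "k"
    return out

def greek_outcome(segments):
    """Apply the two steps in the order given in the text."""
    return delabialise_after_u(round_prop_vowel(segments))

stem = [LABIOVELAR, "ə", LABIOVELAR, "l"]
print("".join(greek_outcome(stem)))  # kukl
```

The order matters in this toy model: the second rule has nothing to apply to until the first has created a /u/.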

As the accentual difference between the singular and the collective became non-productive, the paradigm was levelled out in various ways to eliminate the mismatch; that is why the accent is consistently initial in Greek (generalised from the singular) and consistently final in Vedic (from the collective). Since *kʷ(e)kʷláh₂ looks like a neuter plural, speakers were tempted to supply an innovated neuter singular to match, *kʷ(e)kʷlóm, instead of the inherited masculine (hence e.g. Vedic cakrám beside much rarer cakrás). The function of the “echo” prefix *kʷé-/*kʷə- isn’t entirely clear, but judging from cross-linguistic tendencies we can speculate that reduplication gave the underlying verb root an iterative colouring (‘go round and round and round’ rather than ‘complete a turn’).

While rare, the derivational pattern visible in the ‘wheel’ word (a thematic noun formed from a reduplicated verb root) is not isolated, and can be found also on the Anatolian side of the oldest split in the Indo-European family-tree. For example, the Hittite word for ‘rake’ was hah(ha)ra-, plausibly reconstructed as *h₂áh₂ro- ← *h₂e-h₂rh₃-o-. Here the root is *h₂arh₃- ‘break the soil, plough’, as in Greek aróō, Proto-Slavic *orjǫ, Old English erian (all meaning ‘to plough’), or in the widespread Neo-Indo-European instrument noun *h₂árh₃-trom ‘ard, plough’ (Greek árotron, Old Norse arðr).

The behaviour of the ‘wheel’ word in Germanic is so interesting and instructive that it deserves to be covered in a separate post (to appear soon).

¹ My use of the terms “Core Indo-European” and “Neo-Indo-European” is explained here.

² Of course the restoration of *e on the analogy of the singular was possible, and it certainly happened in some branches of Indo-European.

³ Note the semantic development in Tocharian, where *kʷékʷlos > Toch.A kukäl, Toch.B kokále came to mean ‘wagon, chariot’.

28 December 2015

Echoes of the Distant Past: Fossil Reduplications

Modern English has its normal share of nursery words, colloquial interjections, and miscellaneous other onomatopoeic or expressive words involving sound-repetition: daddy, baby, nanny, sissy, pee-pee, bye-bye, ta-ta, goody-goody, ding-dong, pop, riff-raff, hip-hop, bow-wow, cuckoo, hurdy-gurdy, tic-tac-toe, bubble, giggle, mumble, google, etc. English also has reduplicative words borrowed from other languages: dodo, can-can, dum-dum, yo-yo. Some such imports are old and their reduplicative status is no longer obvious to non-specialists: barbarian, purple, turtle-dove. A few echoic words exhibiting a repetitive pattern are at least as old as the English language, whatever their ultimate origin; cock and chicken belong here.

Traces left by a reduplication
[Source: Beentree/Wikipedia CC]
Note, however, that the words listed above are not derived by reduplication. For example, giggle cannot be traced back to a simpler verb with only one occurrence of /ɡ/. In the overwhelming majority of cases the repetition is merely phonetic, not morphological. Reduplication in the proper sense of the word (involving a base and an echo) is not used in English to perform any of its typical, cross-linguistically common tasks, such as the formation of plural or collective nouns, verb stems of a particular aspect or tense, intensive verbs or adjectives, deverbal nouns, etc. This is one of those things that make English, together with some other languages of the northerly latitudes, a little weird.

Interestingly, morphological reduplication is given looser rein in some English-based creole languages, for example in Tok Pisin, where it seems to be on the rise as a derivational device – presumably as a result of contact with the heavily reduplicating indigenous languages of Papua New Guinea. Here are some examples:
kala ‘colour’ → kalakala ‘colourful’
bruk ‘break, fall apart’ → brukbruk ‘fall apart into many small pieces’
pilai ‘play’ → pilaipilai ‘play round’
ron ‘run’ → ronron ‘keep running’
tok ‘talk’ → toktok ‘conversation’
wil ‘wheel’ → wilwil ‘bicycle’¹
Has English preserved any really old reduplications, with cognates in other branches of the Indo-European family? Yes, but there are only a handful left, and most of them show no transparent reduplicative structure any longer. Among those relics there are at least two nouns, wheel and beaver (probably also tetter ‘skin disease’), one adjective, quick (provided that my etymology of PIE *gʷih₃wó- ‘living’ in Gąsiorowski 2007 is correct), and two verbs in the past tense, ate and did. Despite the fact that the two irregular past tenses represent the same modern category, they go back to different Indo-European verb forms, characterised by different reduplication patterns. Perhaps most surprisingly of all, the regular past-tense ending – and not just the -d of loved, watched, waited, but also the -t/-d of kept, brought, sold – vaguely reflects an ancient reduplication as well, and has in fact the same origin as did. I will trace these connections later in this series.


¹ Since wil = Eng. wheel, which itself is an old reduplicated noun, Tok Pisin wilwil is a quadruplication, etymologically speaking.

27 December 2015

How to Stammer Grammatically: Reduplication

Linguistic signs are mostly arbitrary in the sense that their form is not directly related to the concept they express. For example, there is nothing in the phonetic shape of the Malay word ikan to suggest its meaning – ‘fish’, or, by extension, any ‘marine animal’ (turtle, whale, oyster, etc.). The sound of the word is not intended to evoke swimming or splashing. It is just a regular historical reflex of Proto-Austronesian *Sikan (with the same meaning and also an arbitrary phonetic shape). It has cognates in other Austronesian languages, for example Hawaiian i‘a [ˈiʔa]. None of them makes you say to yourself, “Methinks it is like a fish.” Indeed, even if a word starts out as onomatopoeic, sound changes will in the long run alter its pronunciation beyond recognition, eventually reducing or destroying its imitative value (see the etymology of English pigeon).

Affixes and auxiliary words are usually not iconic either. English regularly indicates the plural number of nouns with the suffix -(e)s (pronounced [s, z, ɪz], depending on the context); some nouns (including fish) form endingless plurals. Neither the suffix nor its absence “portrays” plurality, whether by resemblance or by analogy. The same can be said of irregular plurals like goose : geese or child : children. Is it possible at all to express plurality iconically – that is, to make a linguistic sign sound plural? Yes, it can be achieved by amplifying the sign itself to indicate “more of something”; and one simple way to amplify it is to repeat it. Malay nouns are not inflected for number. Plurality, if it matters in a given situation, may be signalled by the use of numerals or quantifiers, or just inferred from the context. But the speaker may also choose to emphasise the multiplicity of referents by doubling the noun: ikan-ikan ‘fish’ (plural). This is similar to emphatic repetition occasionally encountered in all languages, including English, as in:
We rode for miles and miles.
What do you read, my lord? ― Words, words, words.
In English, word repetition is a syntactic phenomenon; in Malay, it is used as a word-formation mechanism. Note, by the way, that many Malay nouns obligatorily consist of a double occurrence of the same sequence and have no simplex counterpart, e.g. biri-biri ‘sheep’ (singular and plural), while others change their meaning if doubled (mata ‘eye’ : mata-mata ‘spy, detective, police officer’). Root-doubling can also be used with adjectives to indicate intensity (her wild, wild eyes could serve as an English analogue), and with verbs to indicate repetitive or prolonged action. In those cases the doubling is definitely iconic. But duplicated verbs may also refer to a sloppy or leisurely execution of an action, e.g. makan ‘eat’ : makan-makan ‘peck at the food’ (showing lack of interest or appetite). Here the iconicity is less self-evident.

The technical term for such morphological doubling is reduplication. In the Malay examples above the entire root is faithfully repeated, but numerous languages also employ partial reduplication in which the repetition is just hinted at rather than applied in full. Typically, a fixed pattern of consonants (C) and vowels (V) is used as a simplified copy of the morphological base – most often a CV or CVC template. Sometimes only the consonants are copied from the base, while the V position is filled by a fixed default vowel (e.g. [ə]). Depending on the language, the copy may be attached before the base (as a prefix) or after it (as a suffix), or even inserted inside it (as an infix). The copy is usually called the reduplicant, but I prefer the handier and less esoteric term echo. We shall be mostly concerned with reduplicative prefixes, that is, cases in which the echo is placed before the base. For example, in Yucatec Maya CV reduplication is employed to form intensive adjectives and intensive or iterative verbs:
k’aas ‘bad’ : k’a’-k’aas ‘evil’
p’iik ‘break (something hard)’ : p’i’-p’iik ‘break into many fragments’
Partial reduplication of this kind is not unlike stammering, which may also involve incomplete syllable repetition: b—b—black [bəbəˈblæk]. Of course there is an important difference: reduplication is controlled by the speaker, while stammering is involuntary and has no grammatical function. 
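The difference between full reduplication and a CV echo prefix can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names, the small vowel inventory, and the schematic base "pat" are mine (it is not a real word of any language discussed here), and the base is assumed to begin with a consonant. The Yucatec Maya echoes above also appear to contain a glottal element (k’a’-), which this sketch does not attempt to model.

```python
VOWELS = set("aeiouəæ")  # a deliberately tiny inventory, for illustration only

def full_reduplication(base, sep="-"):
    """Repeat the whole base, as in Malay ikan : ikan-ikan."""
    return f"{base}{sep}{base}"

def cv_echo(segments, default_vowel=None):
    """Build a CV echo: copy the first consonant of the base, then either
    copy the first vowel of the base or use a fixed default vowel."""
    consonant = segments[0]
    if default_vowel is not None:
        vowel = default_vowel
    else:
        vowel = next(s for s in segments if s in VOWELS)
    return [consonant, vowel]

def prefix_reduplication(segments, default_vowel=None):
    """Attach the echo before the base (a reduplicative prefix)."""
    return cv_echo(segments, default_vowel) + list(segments)

print(full_reduplication("ikan"))                          # ikan-ikan
print("".join(prefix_reduplication(["p", "a", "t"])))      # papat
print("".join(prefix_reduplication(["b", "l", "æ", "k"],
                                   default_vowel="ə")))    # bəblæk
```

The last call copies only the consonant and fills the V slot with a default [ə], which produces the same shape as a single stammered repetition of black.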

Expressing plurality, intensity, repetition or, more generally, “greater degree” is the most natural use of reduplication, with a clear cognitive motivation. However, once adopted as a derivational or inflectional device, reduplication easily acquires secondary functions, gradually dropping its iconic character and evolving into another “arbitrary” morphological tool. Reduplication, in its numerous variants, has a global distribution. It’s only in a circumpolar belt of the northern hemisphere, including Europe, Northern Asia and the northernmost part of North America, that reduplication plays little role in derivational and inflectional morphology. From a Eurocentric perspective grammatical reduplication may look exotic; we shall see, however, that it had important functions in Proto-Indo-European and some of the languages descended from it.

17 December 2015

Sex, Greek, and Rix’s Law

A recent comment by David Marjanović made me reflect on a Sanskrit word, yábhati ‘have sexual intercourse’ (as the Monier-Williams dictionary tactfully puts it). The verb is of special interest to speakers of Slavic languages, because its exact cognate – Proto-Slavic *jebe/o- (with a host of Slavic derivatives) – remains one of the favourite obscenities in all the languages belonging to that branch of Indo-European. Interestingly, the verb is only very sparsely attested in Iranian and seems to be completely absent from Baltic. In Modern Indo-Aryan its reflexes are quite numerous, though hard to recognise after more than two millennia of sound change, sometimes combined with euphemistic deformation.

By comparing Indo-Iranian and Slavic cognates, we arrive at the stem *jébʰ-e/o- (3sg. *jébʰeti, 3pl. *jébʰonti) as the most parsimonious reconstruction of their ancestral form. It’s a so-called “simple thematic present” – an imperfective stem built to the root *jébʰ-, with the vowel *e in the root and the “buffer vowel” *-e/o- added before personal endings. If the verb has a deeper origin in Indo-European, its oldest form must have been different. Simple thematic presents occur in large numbers in most of the branches of the family; for example, they account for much of the third conjugation in Latin. However, they are absent from the most outlying lineage of Indo-European (the Anatolian languages), and their low number in Tocharian, the next group that split off before the divergence of the modern branches, shows that they evolved gradually in post-Proto-Indo-European times. Further speculation about the origin of *jébʰ-e/o- via internal reconstruction is difficult because simple thematics have more than one historical source.

I hope it is not all Greek to you.
[Source: Wikipedia]
When we run out of exact cognates, we can focus on the next best thing – plausibly related words with a different morphological structure. Everybody agrees that Ancient Greek oípʰō (with the same meaning) must be a relative of yábhati. Pre-Greek *jébʰ-e/o-, however, would have produced Gk. ˣzepʰō (here, ˣ, not to be confused with the asterisk, marks an unattested, incorrectly predicted form), so the origin of oípʰō must be different. Since the Greek reflex of the root morpheme (oípʰ-) contains an unexpected o, it is justifiable to suspect that one of the Proto-Indo-European “laryngeal” consonants, the one conventionally written *h₃ (probably a voiced pharyngeal fricative [ʕ], if you prefer phonetic symbols) is lurking about. This consonant was vocalised in Greek as o in some positions; it could also (already in PIE) change an adjacent *e into *o. This is why the root we are discussing is often reconstructed as *h₃jébʰ- to accommodate the o-colouring fricative. Unfortunately, most sources just put the laryngeal there and don’t attempt to explain the Greek form in detail.

The trouble is that oípʰō can’t be derived from *h₃jébʰ-e/o- either. According to recent work on PIE syllable structure (Byrd 2015; see also here), the sequence *h₃j- was simplified to *j- in word-initial positions very early in the history of Indo-European, so in this case too we should expect Gk. ˣzepʰō, just as if the *h₃ weren’t there. Some authors propose that *h₃jébʰ-e/o- had a metathetic byform *h₃óibʰ-e/o-, in which *j and *e had swapped places, which caused the latter to get coloured to *o by the preceding *h₃. Such a solution, however, is desperately ad hoc. There is no morphological or phonological motivation for the metathesis, and the wish to see the desired output is not enough.

Another ad hoc solution is adopted in the Lexikon der indogermanischen Verben (Lexicon of Indo-European Verbs, LIV), where the root is listed as *jebʰ-, and its Greek reflex is reconstructed as a present stem with the zero grade of the root and the prefix *o-, that is, *o-ibʰ-e/o-. The problem is that such an alleged verb prefix is vanishingly rare in Greek (so rare that its very reality is questionable), and its function (if any) is unspecified. Solving one mystery by creating another is not sound etymological practice.

A more ingenious suggestion was made by Johnny Cheung in his Etymological Dictionary of the Iranian Verb (2007). Cheung proposes that the Greek present was reduplicated. Grammatical reduplication in PIE involves copying the initial consonant, extending it with the vowel *e or *i, and pasting it back onto the root as a prefix. There are several classes of Indo-European verb stems formed in this way. Following Cheung’s suggestion, we should reconstruct *h₃e-h₃ibʰ-e/o-, which after the laryngeal colouring of the first *e yields *h₃oh₃ibʰ-e/o- and – hey presto – Gk. oípʰō.
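The copy-and-prefix procedure, followed by laryngeal colouring, can be written out as a small sketch. The function names and the segment-list representation are my own assumptions; the thematic suffix is left out, and the later Greek developments (loss of the laryngeals and so on) are not modelled at all.

```python
# Vowel "colour" associated with each laryngeal; *h₁ leaves *e unchanged.
COLOUR = {"h₁": "e", "h₂": "a", "h₃": "o"}

def reduplicate(root, vowel="e"):
    """Copy the root-initial consonant, extend it with the reduplication
    vowel, and prefix the result to the root."""
    return [root[0], vowel] + list(root)

def colour_e(segments):
    """Recolour every *e that stands next to a laryngeal."""
    segments = list(segments)
    out = list(segments)
    for i, seg in enumerate(segments):
        if seg != "e":
            continue
        neighbours = segments[max(i - 1, 0):i] + segments[i + 1:i + 2]
        for n in neighbours:
            if n in COLOUR:
                out[i] = COLOUR[n]
                break
    return out

zero_grade = ["h₃", "i", "bʰ"]
stem = reduplicate(zero_grade, vowel="e")
print("-".join(["".join(stem[:2]), "".join(stem[2:])]))  # h₃e-h₃ibʰ
print("".join(colour_e(stem)))                           # h₃oh₃ibʰ

# i-reduplication works the same way with a different vowel:
print("".join(reduplicate(["s", "d"], vowel="i")))       # sisd
```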

Alas, the formation of reduplicated presents is something we understand rather well – well enough to see a couple of problems with this reconstruction. First, although *e may appear as the reduplication vowel in IE present stems, it does so only in so-called “athematic” ones (without the *-e/o- suffix). In thematic presents, i-reduplication occurs instead, as in *si-sd-e/o- ‘sit’ > *sizde/o- > Gk. hízō. Secondly, even in athematics, *e seems to have alternated with *i. The details of the alternation are still debated, but one thing is sure: Greek generalised i-reduplication thoroughly in this class, so that we find it in Ancient Greek present stems (thematic and athematic alike) to the complete exclusion of e-reduplication. Therefore, *h₃e-h₃ibʰ-e/o- just won’t float – not in Greek waters.

There remains another possibility, also considered by Cheung but qualified as less likely than the reduplicated root: a zero-grade thematic present, *h₃ibʰ-é/ó-. Such a stem structure is also well known; one typical example is *gʷr̥h₃-é/ó- ‘devour, swallow’ (Sanskrit giráti, Slavic *žьreti). Both *h₃ibʰ-é/ó- and *(h₃)jébʰ-e/o- (with an early loss of *h₃) could be independently derived from a still older common prototype, most probably a root verb without any suffixes. Why, then, should *h₃e-h₃ibʰ-e/o- be “more likely” than *h₃ibʰ-é/ó- as the source of oípʰō?

The problem here is that we aren’t really sure what happened to initial *h₃i- in the transition from PIE to Ancient Greek. There was a pre-Greek sound change, known as Rix’s Law, which changed any initial *HR̥- into Greek VR-. In these formulae, R stands for any liquid or nasal (l, r, m, n), R̥ is its syllabic variant, H is any of the three PIE laryngeals, and V is a vowel whose quality matches the phonetic “colour” of the laryngeal (e, a, o for, respectively, *h₁, *h₂, *h₃). To what extent the sequences *Hi- and *Hu- were also affected by Rix’s Law has been a matter of some dispute. PIE *i, *u can be regarded as syllabic variants of the corresponding glides *j and *w; therefore, it is at least thinkable that Rix’s Law could apply to them as well.
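For readers who like to see a sound law spelt out mechanically, here is a minimal sketch of the formula *HR̥- > VR-. Everything about the implementation is an assumption of mine: the segment lists, the function name, and above all the optional flag `include_high_vowels`, which merely encodes the “thinkable” extension to *Hi- and *Hu- and applies it uniformly to all three laryngeals, a deliberate simplification. The inputs are schematic sequences, not reconstructed words.

```python
SYLLABIC = "\u0325"  # combining ring below, marking a syllabic resonant
VOWEL_FOR = {"h₁": "e", "h₂": "a", "h₃": "o"}
RESONANTS = {"l", "r", "m", "n"}

def rix(segments, include_high_vowels=False):
    """Apply *HR̥- > VR- to a word-initial sequence of segments.

    With include_high_vowels=True, *i and *u are treated like syllabic
    resonants (a hypothesis, not an established part of the law)."""
    segs = list(segments)
    if len(segs) < 2 or segs[0] not in VOWEL_FOR:
        return segs
    vowel, second = VOWEL_FOR[segs[0]], segs[1]
    if second.endswith(SYLLABIC) and second[:-1] in RESONANTS:
        # The laryngeal is replaced by a vowel; the resonant turns non-syllabic.
        return [vowel, second[:-1]] + segs[2:]
    if include_high_vowels and second in {"i", "u"}:
        return [vowel, second] + segs[2:]
    return segs

print("".join(rix(["h₂", "r" + SYLLABIC, "t"])))                      # art
print("".join(rix(["h₃", "i", "bʰ"])))                                # h₃ibʰ
print("".join(rix(["h₃", "i", "bʰ"], include_high_vowels=True)))      # oibʰ
```

Without the flag the function leaves *h₃i- untouched, which is exactly the point at issue.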

As for the sequence *Hi-, however, it can be demonstrated with good examples that no initial vowel developed if the laryngeal was *h₁. It has furthermore been suggested that the outcome could be Gk. hi- (with an initial aspirate) rather than simply *i- (Bozzone 2013). For *h₂ and *h₃ the evidence is inconclusive (no unambiguous examples). But there is no clear counterevidence either to rule out *h₂i- > Gk. ai- or *h₃i- > Gk. oi- (pace Peters 1980¹, who argues for *Hi- > Gk. *i- across the board). As for *Hu-, we have several convincing cases showing that *h₂u- > Gk. au-, one or two possible cases of *h₃u- > Gk. ou-, but no examples at all of *h₁u- > Gk. ˣeu-. This may mean that the Greek reflexes of *h₁u- are indistinguishable from *u- since both merged as Gk. hu-, while the other two laryngeals followed the pattern of Rix’s Law. It is therefore possible that *Hi- and *Hu- developed in parallel, and that the expected outcome of *h₃i- is Gk. oi-.

This insight has far-reaching consequences for our understanding of the various combinations of *i/*j and *u/*w with the laryngeals in the prehistory of Greek, but I can only skim the surface of the topic in a blog post. It’s getting too long anyway, so it’s time for the moral. The hero of this little essay is a swear-word so obscene that some old ladies in my country might faint if they saw it printed in a newspaper. On the other hand, you can hear it all the time in the street, adorned with modifying prefixes, converted into derived nouns, adjectives and adverbs, and spawning lots of specialised meanings. It has functioned like that literally for millennia – taboo or no taboo. Living the merry life of an outlaw, it has become a respectable archaism, almost a living fossil, with an impeccable pedigree and aristocratic Vedic connections. Together with its equally naughty Ancient Greek cousin, it may provide a precious piece of crucial evidence needed to solve a vexing problem in Greek historical phonology. Not bad for a dirty little word.


¹ Martin Peters. 1980. Untersuchungen zur Vertretung der indogermanischen Laryngale im Griechischen. Vienna: Verlag der Österreichischen Akademie der Wissenschaften.