20 January 2016

Ve-Verbs: A Brief Introduction

There were several types of Indo-European verbs formed by reduplication. The most important of them are listed below. Each type is represented by a verb in its 3sg form (of the active voice, where relevant); for glossing purposes, a female subject is assumed:

  • athematic presents with a Ci or Ce echo: *sti-stéh₂-ti [*stistáh₂ti] ‘she’s rising to her feet’ [1];
  • thematic presents with a Ci echo: *sí-sd-e-ti [*sízdeti] ‘she’s taking a seat’;
  • thematic aorists with a Ce echo: *wé-wkʷ-e-t [*wéukʷet] ‘quoth she’;
  • perfects with a Ce echo: *me-món-e ‘she remembers’.

There are also a couple of other reduplicated present types, marked by the use of derivational suffixes. All of them have Ci echoes and are not very different from the second type above:

  • reduplicated sḱe-presents: *dí-dḱ-sḱe-ti [*dítsḱeti] ‘she accepts/learns’;
  • reduplicated desideratives: *wí-wrt-h₁se-ti [*wíwr̥tseti] ‘she wants to turn’.

A famous reduplication (almost too perfect to be true)
Wikimedia Commons

Still other types can be found in some languages of the family but cannot be safely added to the inventory of Proto-Indo-European verb stems because they are either too poorly attested or too restricted in their distribution. The former is true of athematic reduplicated aorists, and the latter of the Indo-Iranian intensives with “full” reduplication (more precisely, with a CVC echo). Attempts to demonstrate their PIE status have not been successful so far.

I shall begin with the first two types (“underived” reduplicated presents, both athematic and thematic). I’ve already had to mention reduplicated presents in earlier posts. There is some kind of relationship between them and reduplicated nouns, and some of the same issues, like the *e ~ *i alternation in the echo syllable, will be revisited. The exact reconstruction of the reduplicated present is one of the hot problems of Indo-European morphology, not yet settled to everybody’s satisfaction, but important enough for people to keep trying. In the technical literature on the subject, you will find a variety of proposals which can’t all be correct at the same time. I don’t insist that the analysis I’m advocating is the solution; still, it’s more worthwhile to take the bull by the horns and tackle a vexing question than just to report handbook stuff. Controversy makes for an interesting debate.

One important special problem to be discussed separately is the reduplicated “present” stem [2] of the root *dʰeh₁- ‘put, place’ (plus a dozen or two other meanings it acquired in the early history of Indo-European). Next, I shall discuss the Indo-European perfect, partly because of its importance for understanding the origin of the Germanic “strong” past tense (English sang, drove, bound, etc.).[3] The remaining loose threads will be tied up in the final post of this series.


[1] The reconstruction in square brackets is more phonetic, taking into account the operation of assimilatory processes, syllabification rules, and cluster simplification. The glosses are approximate: the exact shade of meaning produced by the combination of PIE tense, aspect and Aktionsart may be difficult to recover and even more difficult to convey in English.

[2] Why the scare quotes? Because the “present” (imperfective) stem did not occur only in the present tense, and it’s exactly the past-tense indicative of this stem, the so-called “imperfect” of PIE *dʰeh₁-, that played a role in the development of the Germanic verb system.

[3] None of them is reduplicated in Modern English, and few strong preterites remained reduplicated even in Proto-Germanic.

19 January 2016

The Root Question: Why *bʰer-?

The verb root *bʰer- has several paradoxical properties. On the one hand, it’s one of the most securely attested Indo-European roots, documented in Tocharian, Armenian, Greek, Phrygian, Albanian, Indo-Iranian, Balto-Slavic, Germanic, Italic and Celtic. On the other hand, it’s conspicuous by its apparent absence from Anatolian, which means that despite its ubiquity in the rest of the family its Proto-Indo-European status is insecure (but see below on possible Anatolian reflexes). The present stem *bʰér-e/o- is a widespread “simple thematic present”, so familiar as a handbook example that the whole class is often referred to as the *bʰéreti-type.[1] Still, several languages (Latin, Greek, Vedic) show traces of an alternative athematic stem without the *-e/o- suffix – probably a so-called “Narten present” with an underlying long vowel (*bʰḗr-ti, *bʰér-n̥ti). Despite being so common, and despite having such a basic meaning as ‘carry, bear’, the verb lacks some conjugational forms in some Indo-European languages, so that *bʰér- has to team up with other roots to form a complete paradigm. In Latin, for example, the present (ferō), the imperfect (ferēbam) and the infinitive (ferre) are derived from *bʰer-, but the perfect tense (tetulī or tulī) and the perfect passive participle (lātus < *tlātos) are provided by the root *telh₂- ‘lift, raise, support the weight of’. In Greek, we again have reflexes of *bʰer- in the present and the imperfect, while most other forms come from *h₁neḱ- ‘take, acquire’ (and the suppletive future oísō does not even have an established etymology). In Slavic, imperfective *bьrati, 1sg. *berǫ ‘take’ is paired with perfective *ęti, *(j)ьmǫ, from PIE *h₁em- (Lat. emō).

Always collecting stuff...
Photo: Jacek Zięba, CC BY-SA 3.0
The meaning of *bʰer- is quite variable. In many branches its reflexes can be glossed as ‘carry, bear’ (of course English bear is a good example), with connotations of movement rather than static support, and of personal physical effort rather than vehicular transport (in the latter case *weǵʰ- ‘cart, convey’ is used). But the root has developed a large number of secondary senses: ‘take, take up, take away, collect, lift, bring, yield, produce, bear offspring, endure’, etc., and in some branches the core meaning has undergone a considerable semantic shift. Thus, Slavic *berǫ means ‘take’, while *nesǫ from the root *h₁neḱ- (originally ‘take, acquire’) has come to mean ‘carry’ (as if the two roots had swapped meanings). Lithuanian also has nèšti (1sg. nešù) for ‘carry’, but the meaning of Lith. ber̃ti, Latv. bḕrti is ‘scatter’ – so distant from ‘carry’ that doubts have been raised as to whether the Baltic words really derive from *bʰer- (though a development like ‘carry/take around’ > ‘circulate, distribute, disperse’ is quite natural, cf. Latin circumferō).

The oldest reconstructible present, *bʰḗr-/*bʰér-, probably meant ‘carry’ in a “telic” sense (as an action with an endpoint: ‘bring or remove by carrying’). The verb gave rise to a root agent noun, *bʰṓr ‘one who takes away’ → ‘thief’ (Latin fūr, Greek pʰṓr). The widespread simple thematic *bʰér-e/o-, which probably originated as the “mediopassive” voice of the original present (with self-benefactive or passive senses), basically inherited its semantics but emphasised the durative shade of the verb and its imperfective character (hence the need to employ some other root to express the perfective and stative aspects).

Vedic alone documents a clear contrast between telic *bʰér(-e/o)- (bhárati, also Rigvedic bhárti) and atelic (iterative, habitual) *bʰi-bʰ(é)r- (bíbharti, Rigvedic bibhárti, 3pl. bíbhrati), but given the fact that CV-reduplicated presents are generally a recessive class of stems in Indo-European, reducing rather than enlarging its membership in the historically known languages, we are probably dealing with an archaism rather than a local innovation.[2] In other words, the distinction between *bʰḗr-/*bʰér- and a reduplicated present (indicating, respectively, events with an endpoint and without one) may be at least as old as the Core Indo-European subfamily. It might even be Proto-Indo-European in the strict sense, assuming that the absence of the root *bʰer- from Anatolian is accidental and due to its having been ousted by (near-)synonyms such as Hittite arnuzi ‘brings, sends, delivers’ or pē-dai ‘carries’.

... and piling it up.
Actually, isolated derivatives of *bʰer- may exist also in Anatolian. The Hittite word for ‘small rodent, mouse’, kapart-, has been etymologised as *ko(m)-bʰr̥-t- ‘gatherer, collector’ (of “stolen” grain).[3] There’s also a possible Lydian cognate, kabrdokid ‘steals’, a verb derived from an abstract noun supposedly meaning ‘hoarding away, stealing’.

The notion of collecting, gathering or bringing together often accompanies the use of *bʰer-. Greek pʰóros (from *bʰór-o-) means ‘earnings, tribute’, and one of the meanings of pʰorā́ (*bʰor-áh₂) is ‘crop’. The abstract noun *bʰr̥-tí- (Ved. bhṛtí- ‘carrying, bringing, support, maintenance’) acquires a concrete meaning in Armenian bard ‘pile, sheaf (of corn)’. Assuming hypothetically that the reduplicated iterative present could form a noun like Hitt. mēmal ‘groats’ (the product of grinding), we might expect *bʰé-bʰr̥ (or perhaps collective *bʰé-bʰōr) ‘the effect of continual collecting, a growing pile’. Like, say, a beaver’s construction – a dam or a lodge. The builder or inhabitant of a *bʰébʰ(o)r- would have been a *bʰébʰros (or possibly *bʰibʰrós, or both; the accent in nominals of this type is hard to predict), and an appropriate epithet referring to the same animal’s prominent behaviour – the assiduous collection and transport of building materials to repair, strengthen and enlarge its constructions – would have been *bʰibʰrús (or *bʰebʰrús) ‘one that’s always gathering stuff’ (timber, twigs, mud, etc.). I think the reduplication makes more sense with the root *bʰer- than with any other similar verb that might refer to something that beavers habitually do. The ability to cut down trees, for example, could be expressed by forming a simple agent noun; iterativity would not need to be emphasised. The male beaver’s legendary defensive stratagem – biting off its testicles and throwing them before hunters – would of course be a one-time trick; and “being brown” is not even eventive, let alone iterative.
So much for beavers, and for the topic of Indo-European nouns showing CV-type reduplication. The next post will be about reduplication in verbs.

[REDUPLICATION: back to the table of contents]


[1] The simple thematic presents arose in the Core Indo-European group and are absent from the Anatolian languages, as far as we know. Only a small number are known from Tocharian; *bʰér-e/o- is one of them.

[2] The alternative iterative stem, *bʰor-éje/o-, is attested only in Greek as pʰoréō ‘carry around, wear, possess (a feature)’.

[3] See Lat. conferō ‘bring together, collect’, and compounds like Vedic iṣu-bhṛ́-t- ‘arrow-carrying’ (describing an archer).

18 January 2016

Towards a More Realistic Beaver

When we consider the known patterns of CV reduplication in Indo-European, we find that different reduplicated adjectives or nouns with very similar meanings can be derived in parallel from the same verb root. One pair already mentioned is Vedic sásni- (a cákri-type word) : siṣṇú-, both from the verb root *senh₂- ‘gain, strive after, accomplish’. Both adjectives mean, approximately, ‘constantly gaining/winning for oneself or others’. No CV-reduplicated present derived from this root is attested. It forms an Indo-Iranian reduplicated perfect (which, however, expresses a completed action and has no iterative or habitual connotations), and a Vedic “intensive” present with full reduplication (which does mean ‘gain/acquire repeatedly’ but is structurally different from the adjectives in question).[1] It is possible, however, that once a productive derivational schema became established, it was not essential that an actual CV-reduplicated present should exist. E(e)-R(ø)-i- or E(i)-R(ø)-u- adjectives, as well as E(e)-R(ø)-o- (*kʷékʷlo-type) nouns could be formed directly on the basis of a verb root. In one of the Rigvedic hymns to Indra (Book 6, 23:4b) the god is described as follows:
babhrír vájram papíḥ sómaṃ dadír gā́ḥ
carrying the vajra, drinking soma, giving cows
(doing all these things habitually, i.e. whenever he comes to attend a soma-pressing). We have no fewer than three cákri-type quasi-participles here.[2] Note that they take accusative objects, like the corresponding verbs. And yet, although all three verbs form CV-reduplicated presents in Vedic, the adjectives can’t be derived directly from those presents. The Vedic present of *bʰer- ‘carry’ (3sg./3pl.) is bi-bhár-ti [3]/bí-bhr-ati with an i-reduplication[4]; from *poh₃(i)- ‘drink’ we have pí-b-a-ti/pí-b-a-nti. At least in the latter case both the i-reduplication and the voiced *b (by assimilation, from the sequence *-ph₃-, with a voiced laryngeal) are very old, at least as old as the common ancestor of Vedic, Latin and the Celtic languages.[5] The adjective papí- seems to have been formed directly to the Indo-Aryan root -/-, using the cákri-type template. The type itself is probably an Indo-Iranian innovation (especially productive in Vedic), inspired by the use of *-i- rather than *-o- as the final vowel in compound stems. The precursor of the cákri-type is essentially identical with the *kʷékʷlo-type (except perhaps for an accentual contrast between nouns and adjectives, if the final accent of babhrí- is original and the initial one in cákri- is a Vedic innovation). Therefore the formation represented by babhrí- is a reworking of an older type which can be reconstructed as *bʰe-bʰr-ó- ‘(ever-)carrying’ – or, when substantivised, *bʰé-bʰr-o- ‘habitual carrier’. A parallel u-stem with practically the same meaning may also have existed, either *bʰi-bʰr-ú- (like Ved. siṣṇú-) or possibly *bʰe-bʰr-ú- (like Ved. (pari-)tatnú- ‘surrounding’).[6] Thus, both the *Ce- ~ *Ci- variation in the echo and the coexistence of stems in *-o- and *-u- can be explained with recourse to known Indo-European word-forming processes.

Two well-known Indo-European semiaquatic mammals
Conrad Gessner, De piscium et aquatilium animantium natura

But wait a moment: *bʰé-bʰr-o- and *bʰi-bʰr-ú- look exactly like the reconstructed variants of the ‘beaver’ word. If beavers owe their Indo-European name not to their coat colour but to some characteristic habitual activity, the verb describing that activity should be similar to *bʰer- ‘carry’. There are, for example, a couple of known roots of the shape *bʰerH-, one meaning ‘cut, strike, pierce, fight’ (with an unspecified laryngeal) and the other ‘move rapidly, rush, chase’ (in which *H = *h₂ or *h₃). The laryngeal would have been lost in a reduplication containing the root in zero-grade, so we would not be able to see any difference between the outcomes of *-bʰr- and *-bʰrH-.

Stretching the imagination a little, one would be able to connect the meaning of any of these roots with the beaver’s habits. For example, the first *bʰerH- is glossed ‘mit scharfem Werkzeug bearbeiten’[7] in the LIV; and what are the beaver’s incisors if not “ein scharfes Werkzeug”? Still, I would like to defend the simplest solution, involving the most widespread and most securely reconstructed of these roots, namely *bʰer- ‘carry’. I will justify my preference in the next post. Here, let me only point out that no matter which root we choose, it makes sense to assume that there was more than one related but independently formed variant of the beaver’s name already at a very early stage – at least *bʰébʰros and *bʰibʰrús. It seems that both of them were inherited by languages ancestral to some of the branches of Indo-European. Their visible relatedness, and perhaps the existence in some branches of recognisably related reduplicated verb forms, could have produced still more variants through a kind of lexical cross-pollination, hence the attested variation of the echo vowel, the stem class, and the accentuation.


[1] Reduplication in verbs will be discussed in blog posts to come.

[2] They are accented on the stem vowel, unlike cákri- itself, but the accentual variation looks random and is not correlated with any functional difference.

[3] With the root syllable accented in the Rigveda. Later the accent was shifted to the echo syllable: bíbharti.

[4] When not reduplicated, the Vedic present (bhárati) usually has a telic meaning, i.e. ‘bring’ (a complete one-time activity) rather than ‘carry, bear, wield’.

[5] The original forms were *pí-ph₃-e-ti/*pí-ph₃-o-nti, with the second *p realised as [b].

[6] Cf. Germanic *tetru-, *tetru-ka- (or *titru-ka-?) ‘skin disease, scabies’ (OE teter, Mod.E tetter, OHG zitaroh), Sanskrit dadru-, dadrū (f.) ‘leprosy’, apparently from *der- ‘tear, flay, peel’.

[7] That is, ‘work on (something) with a sharp tool’ – a bit conjecturally, to be sure, since most of the attested meanings suggest the use of a weapon rather than a carpenter’s tool, or are figurative: ‘scold, rebuke’, etc.

12 January 2016

Enter the Beaver

Beaver rampant
Arms of Biberach an der Riß

Arthur Charles Fox-Davies
A Complete Guide to Heraldry (1909)
Wikimedia Commons
Most etymological dictionaries, introductions to Indo-European studies, as well as online sources (including Wikipedia and Wiktionary) inform the reader that the Proto-Indo-European word for ‘beaver’, *bʰébʰrus, is a reduplicated derivative of the root *bʰer- or *bʰreu-, meaning ‘brown’. The same root is often claimed to account for the Germanic ‘bear’ word, *βer-an- (a nasal stem), as if from *bʰer-on- ‘the brown one’. There are several problems with these etymologies.

To begin with, neither *bʰer- nor *bʰreu- is attested as a stem. At best, there are several words in different Indo-European languages which contain reflexes of *bʰ and *r (and sometimes of *u) and mean something like ‘brown’; it is, however, hard to connect them formally within a plausible etymon. We can agree that Modern English brown, Modern German braun and Modern French brun (borrowed from Frankish) are “basic colour terms” and can be used to describe the colour of a beaver’s coat. It doesn’t follow, however, that the same can be claimed of their Proto-Germanic ancestor, *βrūna-. In early Germanic languages the word meant ‘dark, swarthy, dusky’ (as well as ‘shiny, bright’, often with reference to forged metal or the sea), and while it could be used to modify virtually any hue for which there was a name, it was hardly a specific colour term itself. Its extra-Germanic connections are anything but secure: although Greek pʰrū́nē (f.), pʰrũnos (m.) ‘toad’ might or might not be cognate, there is no related Greek colour adjective. The “colour conspiracy” of the modern languages of Europe, which have developed identical or very similar basic colour systems, is a case of recent cultural convergence. As late as the seventeenth century, German braun could still refer to hues in the violet/purple range (e.g. the colour of the amethyst).

Modern version of the same
(we know so much more about beavers today).
Lithuanian bė́ras does refer to shades of brown, but is used as a specialised horse-coat term (like English bay), not a generally applicable colour word, and can’t be directly connected with *βrūna- anyway. Vedic babhrú- means ‘deep brown, reddish-brown’ and is practically identical with the reconstructed ‘beaver’ word, but it is probably derived from the animal’s name, not vice versa. The ancient Indo-Aryans had migrated too far from the geographical range of the beaver to have retained the original meaning, but they did keep the derived descriptive adjective.[1] Secondarily substantivised, babhrú- may refer to several rather different animals of India, from the brown mongoose to the Jacobin cuckoo.

The ‘bear’ connection is dubious too. A “weak” (n-stem) noun would presuppose an adjective like *bʰer(o)-, not recoverable as a Proto-Indo-European colour term (even the isolated East Baltic adjective mentioned above isn’t a perfect match), and there is an attractive alternative: the *βer- part can be derived either directly from the root noun *ǵʰwēr-/*ǵʰwer- ‘wild animal, beast’ (Ringe 2006: 106) or more plausibly from the corresponding thematic adjective ‘wild, savage’ (cf. Lat. ferus). To be sure, the hypothesis that word-initial *gʷʰ and *ǵʰw yield Germanic *β remains somewhat controversial (there are a small number of examples), but the etymology of bear as ‘the ferocious one’ is semantically unassailable. The substantivisation of an adjective by turning it into an n-stem is a common morphological process.

Instead of trying to guess in advance what the *-bʰr- part of the beaver’s name stands for, let’s have a look at the full reconstruction first. It’s usually cited as a stem in *-u-, perhaps primarily because of the Sanskrit ‘deep brown’ word, but the total Indo-European evidence is indecisive:

  • In Slavic *bobrъ, *bebrъ, *bьbrъ (note the variation of the echo vowel)[2] the final *-ъ may reflect *-o-s or *-u-s. Some old derivatives and toponyms plus accentual considerations suggest that the word was originally a u-stem in Slavic or perhaps vacillated between the two types, for there’s some evidence supporting an o-stem as well.
  • Baltic shows both u-stem and o-stem forms – the former in Old Prussian bebrus and in the Lithuanian variant bebrùs, the latter in Lith. bẽbras, bãbras, and Latvian bȩbrs.
  • Iranian has an o-stem reflex: Proto-Iranian *babra- > Younger Avestan baβra-, with the variant baβri-; cf. also Pahlavi babrag < *babraka-, with the very productive “colloquialising” suffix *-ka-.
  • Latin has fiber (second declension), as if from *bʰibʰro- (with an i-echo), beside sparsely attested feber.[3]
  • In Celtic, the inherited ‘beaver’ word has been buried under layers of lexical innovations (especially *abankos ‘river animal’) and borrowings. It can be detected in some Gaulish, Old Brittonic and Old Irish toponyms, ethnonyms and personal names, but its exact Proto-Celtic form is difficult to recover: *bebro-, *bebru-, *bibro- and *bibru- possibly coexisted in early Celtic.[4]
  • Finally, the word is excellently preserved in Northwest Germanic. [5] We have e.g. Early Old English bebr, bebir, beber, later befer, befor, beofor; Old High German bibar, bibur; and Old Icelandic bjórr < *bjǫβurr < *beβ(u)raz. All these forms can in principle reflect Proto-Germanic *βeβraz < *bʰebʰro-, though a u-stem can’t be completely ruled out. [Afterthought]

The ‘beaver’ word has a relatively wide attestation, but since the animal itself has occurred mainly at northerly latitudes in historical times, it’s poorly attested in Indo-Iranian and Italic, and not at all in Armenian or in Greek (where we find kástōr instead, borrowed also into modern Albanian). Alas, although beavers lived in parts of ancient Anatolia, we don’t know what the speakers of Hittite or Luwian called them: they weren’t thoughtful enough to write something about beavers for posterity. The Germanic and Balto-Slavic languages have preserved the word best, and it’s in Balto-Slavic that we find the greatest diversity of variants.[6] What shall we make of this variety?

I will try to answer this question in the next blog.



[1] Cf. also Hurrian babrunnu, a technical horsey adjective borrowed from the language of the “Mitanni Indo-Aryans”.

[2] The echo vowel in the modern Slavic languages most often reflects *o (found in all Slavic languages today). The minor variant with *e has a wide but scattered distribution (Serbian Church Slavonic, dialectal Bulgarian, Slovene, Upper Sorbian, Old Russian) and looks like a locally surviving relic (see also the Polish river-name Biebrza, and Romanian breb, borrowed from Slavic). The modern prevalence of *o may be due to a Slavic tendency (inconsistent and poorly understood) to introduce and generalise *o in CV-reduplications. Borrowing is less likely, though Iranian influence has been suspected (as an indication of prehistoric trade in beaverskins and castoreum). Western Lithuanian bãbras seems to be Slavic-influenced. The variant *bьbrъ is rare (Old Russian, Serbo-Croatian dȁbar, with a dissimilated initial stop). It could be regarded as an aberrant local innovation, were it not for the fact that (unlike *bobrъ) it has several exact counterparts in other branches (West Baltic hydronymic *bibru-, Lat. fiber, Celtic *bibru- ~ *bibro-).

[3] Replaced by loanwords (some related to it) in Vulgar Latin and Proto-Romance.

[4] It seems that beavers never colonised Ireland after the last ice age, which of course does not mean that the Irish Celts were unaware of their existence. “Beavery” tribal names could also have been brought to Ireland from Great Britain and/or the continent during prehistoric migrations.

[5] Its absence from the Gothic corpus is due to the usual reason: no beavers in the Bible.

[6] Some East Slavic dialects preserve a uniquely specialised, evidently archaic word for ‘beaver lodge’, *zer(d)mę < *gʰerdʰ-mn̥, with a curious “hyper-satem” treatment of the root *gʰerdʰ- ‘gird, encircle, fence about’, cf. Slavic *gordъ ‘fort, town’, Lith. gar̃das ‘enclosure, stall’, Vedic gr̥há- ‘house’, Albanian gardh ‘fence’. The ‘beaver lodge’ word has been borrowed into standard Polish as żeremie (with a hypercorrect ż).

08 January 2016

Setting the Scene for the Beaver

In the Proto-Indo-European derivational system, adjectives in *-ó- were rarely formed directly from root nouns in zero-grade (often simply identical with verb roots). Thus, they differed from some other adjectival derivatives involving more complex suffixes, e.g. deverbal adjectives in *--, *--, *--. When just the bare thematic vowel *-ó- was added, the zero-grade was usually “reinforced” – in the simplest case, by inserting a full vowel, *e, somewhere inside the root (not necessarily in the “correct” place, that is, not always faithfully restoring the original e-grade). This, however, did not have to happen if the root was the second member of a compound. Since reduplications behave in many respects like compounds (namely, like a root compounded with itself), it is possible that nouns of the *kʷékʷlo-type should be traced back to adjectives like *kʷe-kʷl-ó- ‘revolving’, and these in turn to reduplicated root nouns like *kʷé-kʷ(o)lh₁-, expressing the action itself or its product (in this case, either ‘circular movement’ or ‘circle, cycle’).

In fact, such nouns are not purely conjectural. A few have left tangible reflexes in historically known languages. For example, Hittite mēmal ‘groats’ is an athematic neuter noun derived from *melh₂- ‘grind’ by means of CV-reduplication: *mé-ml̥h₂. In theory, a *kʷékʷlo-type noun could easily be formed via thematicisation and accent retraction: *mé-ml-o- ‘something used in groat production’ (e.g. ‘quern, millstone’); it just happens not to be attested (unless Armenian mamul ‘press’ is somehow derivable from it). As for reduplicated adjectives, Vedic examples such as vavrá- ‘hiding, concealing oneself’ and sasrá- ‘streaming’ can be quoted (the roots in question are, respectively, *wer- ‘cover, protect’ and *ser- ‘flow’).

There are, however, other reduplication types, also based on verb roots but harder to fit into the pattern proposed above. Superficially, they have the same structure: E(V₁)-R(ø)-V₂-, where R(ø) is a verb root in zero-grade. However, V₁ is *i rather than *e, or V₂ is a high vowel (*i or *u) rather than *o; note only that V₁ and V₂ can’t both be *i at the same time. Here are a few characteristic examples from Vedic (where such reduplications are particularly well represented):

Vedic word                           PIE root
vavrí- ‘hiding-place’                *wer- ‘cover’
cákri- ‘active, making’              *kʷer- ‘cut, shape’
babhrí- ‘carrying’                   *bʰer- ‘carry’
sásni- ‘gaining repeatedly’          *senh₂- ‘gain’
siṣṇú- ‘ever-securing’               *senh₂- ‘gain’
jigyú- ‘victorious’                  *gʷei- ‘compel’
(pari-)tatnú- ‘surrounding’          *tenh₂- ‘stretch’
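
The vowel constraints on the E(V₁)-R(ø)-V₂- template stated before the table can be spelled out mechanically. The following is only an illustrative sketch: the function and variable names are my own, and the example words attached to each pattern are simply those cited in this series, not an exhaustive inventory.

```python
from itertools import product

# Vowel choices named in the text: V1 is the echo vowel, V2 the stem vowel.
ECHO_VOWELS = ("e", "i")
STEM_VOWELS = ("o", "i", "u")

def is_other_type(v1: str, v2: str) -> bool:
    """True for patterns that differ from the *kʷékʷlo-type and are not excluded."""
    if (v1, v2) == ("e", "o"):   # the basic *kʷékʷlo-type itself
        return False
    if (v1, v2) == ("i", "i"):   # V1 and V2 cannot both be *i
        return False
    return True

# Vedic illustrations taken from the discussion (Vedic a continues *e in the echo).
EXAMPLES = {
    ("e", "i"): "vavrí-",
    ("e", "u"): "(pari-)tatnú-",
    ("i", "u"): "siṣṇú-",
}

patterns = [p for p in product(ECHO_VOWELS, STEM_VOWELS) if is_other_type(*p)]

for v1, v2 in patterns:
    example = EXAMPLES.get((v1, v2), "(no example in the table)")
    print(f"E({v1})-R(ø)-{v2}-   {example}")
```

Four combinations survive the two exclusions; the table illustrates three of them.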

Agni (with partial reduplication)
Most of these virtual “protoforms” are not likely to be of Proto-Indo-European date; they only illustrate the operation of the derivational mechanism. Indo-European i-stems were typically nouns (often with an agentive meaning) derived from o-stem adjectives. There were also adjectival compounds in which the second member was an i-stem corresponding to a thematic noun (with *-o- replaced by *-i-).[1] Both processes seem to have affected some of the reduplications above. On the one hand, we have vavrá- (adj.) : vavrí- (noun); on the other, cákri- (adjective) looks as if it had originally corresponded to a noun of the *kʷékʷlo-type, and acquired its *-i- by conforming to the productive pattern of compound adjectives (as pointed out above, reduplications are compound-like structures).

But what about adjectives like jigyú- ‘victorious’? In their case, derivation from a *kʷékʷlo-type noun does not seem to be possible. Note that we have an adjectival doublet, sásni- ~ siṣṇú- [2], both derived from the same, widely distributed Proto-Indo-European root, but apparently in different ways.

As opposed to “second generation” adjectives with stems ending in *-i-, u-stem adjectives are a very old type. Some of them can be found on any list of basic Proto-Indo-European vocabulary. In some cases the root to which the *-u- is added is simply adjectival (meaning that it has no other known functions); but it may also be a recognisable verb root. In the last common ancestor of the Indo-European languages the root normally had zero-grade, and the suffix was accented: R(ø)-ú-. Here are a few typical examples: *tn̥h₂-ú- ‘thin’, *pl̥h₁-ú- ‘much, many’, *h₁s-ú- ‘good’, *mr̥ǵʰ-ú- ‘short, brief’, *gʷr̥h₂-ú- ‘heavy’, *h₁ln̥g(ʷ)ʰ-ú- ‘light, nimble, quick’. In terms of function, this *-ú- is almost equivalent to the suffix *-ró-, also found in many common adjectives (and often transparently deverbal), e.g. *h₁rudʰ-ró- ‘red’, *h₂r̥ǵ-ró- ‘flashing, swift’, etc. There are even occasional pairs of (near-)synonyms: *h₁ln̥g(ʷ)ʰ-ú- ≈ *h₁ln̥gʷʰ-ró- (from the verb root *h₁lengʷʰ- ‘move briskly’).[3] One important difference between the two types is that *-ró- adjectives do not occur in old compounds. We may therefore presume that if a “first generation” deverbal adjective was formed from a reduplicated verb, *-ró- was ruled out and *-ú- was the remaining option.

The frequent occurrence of *-i- in the echo syllable of u-stem reduplications may have something to do with the fact that *-ú- is normally added to an ablauting base in zero-grade. Perhaps *-i- was once treated as a weak allomorph of full-grade *-e-.[4] The recipe for a reduplicated u-stem adjective is therefore as follows: take a root (e.g. *senh₂-), reduplicate it using a CV template (*se-senh₂-), make it weak (*si-sn-), and add *-ú- (*sisnú-). Serve in a Vedic hymn to Agni the Bounteous (siṣṇú-).
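
For readers who like to see a recipe written out step by step, here is a minimal sketch of the four stages just described. It is purely illustrative: the `Root` record and the function name are my own inventions, and the weak (zero-grade) shape of the root is supplied by hand rather than computed, since deriving it would require modelling ablaut and laryngeal loss.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Root:
    full_grade: str  # the root as cited, e.g. "senh₂"
    onset: str       # the consonant copied into the echo syllable
    weak_form: str   # zero-grade shape inside the reduplication (hand-supplied)

def u_stem_recipe(root: Root) -> list[str]:
    """Return the four stages of the recipe as reconstructed forms."""
    return [
        f"*{root.full_grade}-",                      # 1. take a root
        f"*{root.onset}e-{root.full_grade}-",        # 2. reduplicate with a CV template
        f"*{root.onset}i-{root.weak_form}-",         # 3. make it weak
        f"*{root.onset}i{root.weak_form}ú-",         # 4. add accented *-ú-
    ]

SENH2 = Root(full_grade="senh₂", onset="s", weak_form="sn")

for stage in u_stem_recipe(SENH2):
    print(stage)
```

Run on *senh₂-, the sketch prints *senh₂-, *se-senh₂-, *si-sn- and *sisnú-, the same sequence as in the recipe; the further step to Vedic siṣṇú- involves Indo-Aryan sound changes that are not modelled here.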

In an earlier article (2007), I analyse the aberrant verb *gʷíh₃w-e/o- ‘live’ and the related adjective *gʷih₃w-ó- ‘living, alive’ as ancient reduplications: *gʷi-h₃w-ó- (from pre-PIE *gʷi-gʷw-ó-) has retained an archaic weak vowel of the echo because its reduplicative structure became obscured very early and protected from any kind of analogical “repair”. Reduplications in *-ú- are similar to those in *-ó-. However, u-stems are more likely to retain their adjectival character, while o-stems can easily be substantivised by means of accent retraction (so that “second generation” cákri-type adjectives must sometimes be generated to replace their lost thematic ancestors).

I am awfully sorry if the discussion above seems too technical, but I shall need to refer to this formal background when presenting the hero of the next post (to appear during the weekend) – the Proto-Indo-European word for ‘beaver’. I was actually planning to deal with beavers today, but I realised that some complicated stuff had better be clarified beforehand.



[1] Cf. Lat. inermis < *n̥-h₂armi- ‘unarmed’ (literally ‘[having] no-weapon’) vs. arma ‘arms’ (an o-stem plural).

[2] To be sure, siṣṇú- occurs only once in the Rigveda (Book 8, 19:31) as an epithet of Agni.

[3] Gk. elakʰús ‘small’, elapʰrós ‘light, quick, small’. Note that labiovelar stops regularly lost their labial component before *u/*w already in Proto-Indo-European.

[4] Cf. the realisation of unstressed etymological /e/ as a high vowel [ᵻ] in many English words, including obscured compounds (as in the traditional pronunciation of forehead, to rhyme with horrid).

05 January 2016

A Reduplication Manual for Drivers, Metalworkers, and Birdwatchers

The case of *kʷékʷlo- is relatively clear, perhaps because the word is not as old as some other Indo-European vocabulary, and its derivation from a verb root is at least semi-transparent. The same goes for several other reduplications with the same structure (echo + known root in zero-grade + thematic vowel). Unlike *kʷékʷlo-, they are less securely attested, appearing only in one or two branches of Indo-European. For example, Latin aurum ‘gold’ has a nice Baltic cognate, Lith. áuksas (dialectal áusas). A careful comparison of these words, taking into account the Baltic tonal accent, leads to the reconstruction *h₂áh₂uso- (= *h₂é-h₂us-o-, originally neuter, as in Latin) for the common ancestor of Italic and Baltic.[1] The verb root here is *h₂wes- ‘light up, dawn’. Gold is therefore “that which glitters”, and we can speculate that the reduplication symbolically suggests the intermittent flickering of reflected light.

Another interesting example, this time restricted to Baltic and Slavic, is the word for ‘cuckoo’: Lith. gegužė̃, Old Russian zegzica ~ žеgъzuľa (< Proto-Slavic *žеgъzа, extended with various suffixes).[2] The ancestral Balto-Slavic form can be reconstructed as *geguźaH. Because of its tongue-twisting, repetitious consonant pattern the word is often “explained” as onomatopoeic, though it bears little actual resemblance to any noise characteristic of the common cuckoo. If, however, we try to analyse it as a *kʷékʷlo-type reduplication and travel back in time beyond Proto-Balto-Slavic, its structure becomes clearer: *gʰegʰuǵʰah₂ (= *gʰe-gʰuǵʰ-e-h₂), a feminine noun based on the verb root *gʰeuǵʰ- ‘hide’ (known from Indo-Iranian and Baltic). I’ll leave open the question whether the Balto-Slavs dubbed the cuckoo “the hiding one” because it is notorious for hiding its eggs in other birds’ nests, or because it’s frequently heard but almost never seen:

          No bird, but an invisible thing,
          A voice, a mystery. [3]

Both explanations make enough sense to justify the etymology.[4]

A strange reduplication hiding in a reed warbler’s nest.
Let us use the following shorthand notation: R is a root morpheme and E is its reduplicative echo (filling a CV template). The vocalism of a morpheme will be shown in brackets: R(ø) means a root in zero-grade (without a vowel), E(e) means an echo in e-grade, and E(é) means that the e-grade echo is accented. A *kʷékʷlo-type derivative can be defined as follows: E(é)-R(ø)-o-. The accent was shifted to the thematic ending when an o-stem formed a collective, and it’s possible that the vowel of the echo was phonetically weakened as a result, yielding E(ə)-R(ø)-é-h₂, but this was a superficial change, easy to reverse by analogically restoring the full vowel of the singular. Note that while ‘wheels’ often occur as a specific set, ‘gold’ is an uncountable mass noun and ‘cuckoos’ do not form natural assemblies. Therefore, of the reduplicated words discussed so far, only ‘wheel’ had a frequently used collective form.
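
For readers who prefer to see a notation spelled out mechanically, the shorthand above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the zero-grade form of each root is supplied by hand rather than computed, and nothing here models the phonology behind the forms.

```python
def thematic_reduplication(initial: str, zero_grade_root: str) -> str:
    """Assemble an E(é)-R(ø)-o- stem (illustrative helper, not a real library call).

    initial          -- the root-initial consonant copied into the CV echo
    zero_grade_root  -- the root in zero-grade, entered by hand (an assumption)
    """
    echo = initial + "é"  # E(é): accented e-grade echo
    return f"*{echo}-{zero_grade_root}-o-"


def collective(initial: str, zero_grade_root: str) -> str:
    """Assemble the collective E(ə)-R(ø)-é-h₂ with the accent on the ending."""
    echo = initial + "ə"  # E(ə): echo vowel weakened after the accent shift
    return f"*{echo}-{zero_grade_root}-é-h₂"


# The two o-stems discussed in the text, segmented with hyphens.
for gloss, (initial, zero) in {"wheel": ("kʷ", "kʷl"), "gold": ("h₂", "h₂us")}.items():
    print(gloss, thematic_reduplication(initial, zero))

# Only 'wheel' is said to have had a frequently used collective.
print("wheels", collective("kʷ", "kʷl"))
```

The point of keeping the echo and the root as separate arguments is that the notation treats them as separate morphemes; the code simply concatenates what the formula names.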

The “thematic” suffix *-e/o- was very productive in the formation of adjectival derivatives. Proto-Indo-European nouns and adjectives had the same patterns of declension, and adjectives could be substantivised (converted into nouns) by so-called “internal derivation”: not by suffixation, but by modifying the accent or vocalism. Many thematic nouns must have originated as adjectives turned into nouns simply by retracting their accent from the thematic vowel. If this happened after the period of dramatic vowel changes – strongly sensitive to the location of stress – which produced the Indo-European ablaut patterns, the accent retraction did not affect the vocalism of the adjective. Some nouns retracted their accent just because they were nouns, even if they had no adjectival counterpart.[5] When such a derivational pattern became productive, reduplicated thematic nouns could be formed from verb roots directly, skipping the adjectival stage, but the meaning of the noun was still “descriptive”, referring to a characteristic action carried out in an iterative or frequentative manner (“turning round and round”, “flashing again and again”, “hiding habitually”).

Since we have examples of E(é)-R(ø)-o- from both Anatolian and Core Indo-European, the type must have originated already in Proto-Indo-European; but given the limited distribution of most of the nouns formed in this way, they were probably coined at the “dialectal” stage, when the descendants of Proto-Indo-European were already diverging into distinct languages. Younger derivatives are typically more transparent and less irregular than old ones – which is precisely what makes their derivation productive. In the next post, I shall try to argue that an older layer of reduplicated nouns, less transparent and harder to analyse, can also be identified.



[1] Michiel C. Driessen. 2003. *h₂é-h₂us-o-, the Proto-Indo-European term for “gold”. Journal of Indo-European Studies 31, 347–362.

[2] The accumulation of stops and fricatives makes the word prone to phonological distortion, so the ancestral sequence ž…gz may change into ž…gž, z…gz, z…z or gž…gž by assimilation at a distance. The Modern Polish word for ‘cuckoo’ is kukułka (an onomatopoeic innovation with a different kind of reduplication), but conservative forms, such as zazula and gżegżółka, have survived in regional dialects. Most Poles remember the funny-looking word gżegżółka (pronounced [gʐɛˈgʐuwka]) from the classroom spelling-tests they were tortured with in their school days. They may dimly recall that it referred to some sort of bird. Schoolteachers, however, rarely explain which bird it is – they probably aren’t sure themselves. Gżegżółka has therefore become an interesting example of a word which has no real communicative function but has been co-opted for a marginal social use (testing schoolchildren’s orthographic memory). See zyzzyva for a similar phenomenon.

[3] William Wordsworth, To the Cuckoo.

[4] The idea is mine, but you are free to share it as long as you credit your source. It has the additional advantage of making it possible to connect Balto-Slavic *geguźaH with the Proto-Germanic word for ‘cuckoo’, *ɣaukaz, on solid formal grounds. The similarity between them has been noticed before, but mere similarity means little without a detailed morphological analysis.

[5] Note that many PIE thematic nouns were accented on a zero-grade syllable, e.g. *h₂ŕ̥tḱos ‘bear’ or *wĺ̥kʷos ‘wolf’.

02 January 2016

Germanic Wheels: Non-Linear Evolution

As we have seen, the effects of the accent shift accompanying the formation of Indo-European collectives were levelled out in Greek and Vedic. Note that such analogical regularisation happens when speakers find it difficult to make sense of the forms they are exposed to. Ancient alternations lose their productivity and become obscured by accumulated layers of sound change. If the outcome survives, it lingers on as a grammatical irregularity. If the whole speech community gets rid of it, the evidence that could be used to reconstruct the original alternation is lost. Fortunately for historical linguists, speakers are not very consistent in “repairing” the irregularities of their language. For example, in the prehistory of Greek the accent and the vocalism of the singular and the collective of ‘wheel’ were levelled out. Nevertheless, speakers didn’t mess with the inherited gender of the word: kúklos remained a masculine despite having a neuter-like plural. We can imagine that a different language could change the gender of the word but preserve clear traces of the accent alternation. And that indeed is what Germanic has done.

The best-known features distinguishing the Germanic languages from the rest of Indo-European are the consequences of two regular sound changes which operated in the common ancestor of the group (Proto-Germanic): Grimm’s Law and Verner’s Law. Grimm’s Law affected all the inherited Indo-European stops, changing their phonation type (voicing) or manner of articulation. The pre-Germanic voiceless stops *p, *t, *k, *kʷ became voiceless fricatives with the same or similar place of articulation: *f, *þ, *x, *xʷ. At roughly the same time the inherited “voiced aspirated” stops *bʰ, *dʰ, *gʰ, *gʷʰ shifted into corresponding voiced fricatives: *β, *ð, *ɣ, *ɣʷ. A little later, the third part of Grimm’s Law was enacted: the remaining inherited stops *b, *d, *g, *gʷ became devoiced, yielding Germanic *p, *t, *k, *kʷ. As a result, Proto-Germanic changed from a language with a large number of stop phonemes into a language with a rich system of fricatives.

Verner’s Law applied to non-initial voiceless fricatives not adjacent to another voiceless sound and preceded by an unaccented syllable. The fricatives affected were either those generated by Grimm’s Law, or the only fricative phoneme inherited from pre-Germanic times, *s (the Proto-Indo-European “laryngeal” fricatives had already disappeared). As a result, *f, *þ, *s, *x, *xʷ became *β, *ð, *z, *ɣ, *ɣʷ in the appropriate environment.

From *kʷékʷlos to wheel? It wasn’t that simple.

Let us see how these changes affected PIE *kʷékʷlos.

Grimm’s Law applied to both occurrences of *kʷ, changing them into *xʷ. Since neither of them was found in an environment triggering Verner’s Law, they remained unchanged till the end of Proto-Germanic. Verner’s Law would have affected the final *-s, but we can’t be sure it was there. The ‘wheel’ word is a neuter noun in Northwest Germanic. We don’t know if it survived in Gothic, the only East Germanic language known from written texts. Most of the preserved Gothic material consists of copies of one partial translation of the Bible, and the text doesn’t happen to mention wheels. It’s obvious that the shift from masculine to neuter in the singular noun took place because the plural looked neuter, but we can’t tell whether it happened in pre-Germanic, Proto-Germanic or the common ancestor of the Northwest Germanic languages. Let us therefore ignore the gendered nom.sg. ending *-s and focus on the stem *kʷékʷlo-, which was the same for neuters and masculines. In the passage from pre-Germanic to Proto-Germanic, *kʷékʷlo- > *xʷéxʷla-.[1]

What happened to the collective *kʷəkʷláh₂? The laryngeal in the ending was lost long before Grimm’s Law, and the vowel was lengthened by compensation. It seems that a full vowel was restored early in the initial syllable on the analogy of the singular, so we may start with the form *kʷekʷlā́, serving as the plural of *kʷékʷlo- (whatever the latter’s gender). In Proto-Germanic, *kʷekʷlā́ became *xʷexʷlṓ by Grimm’s Law. Since the second *xʷ occurred in a voiced environment after an unaccented vowel, Verner’s Law applied, yielding *xʷeɣʷlṓ as the plural of *xʷéxʷla-.
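
The two derivations just described can be replayed as a small rule cascade. What follows is a minimal sketch in Python, not a general model of Germanic sound change: the segment inventory, the function names and the way the accent is stored (as the index of the accented vowel) are all assumptions made for the illustration, and known exceptions to Grimm’s Law are ignored.

```python
# Grimm's Law as a single simultaneous lookup, so the three shifts cannot feed one another.
GRIMM = {
    "p": "f", "t": "þ", "k": "x", "kʷ": "xʷ",      # voiceless stops > voiceless fricatives
    "bʰ": "β", "dʰ": "ð", "gʰ": "ɣ", "gʷʰ": "ɣʷ",  # voiced aspirates > voiced fricatives
    "b": "p", "d": "t", "g": "k", "gʷ": "kʷ",      # plain voiced stops > voiceless stops
}
VERNER = {"f": "β", "þ": "ð", "s": "z", "x": "ɣ", "xʷ": "ɣʷ"}
VOICELESS = {"f", "þ", "s", "x", "xʷ", "p", "t", "k", "kʷ"}
VOWELS = {"a", "e", "i", "o", "u", "ā", "ē", "ī", "ō", "ū"}
VOWEL_MERGERS = {"o": "a", "ā": "ō"}  # the mergers mentioned in footnote [1]


def grimm(segments):
    return [GRIMM.get(seg, seg) for seg in segments]


def verner(segments, accent):
    """Voice a non-initial voiceless fricative after an unaccented vowel."""
    out = list(segments)
    for i, seg in enumerate(segments):
        if i == 0 or seg not in VERNER:
            continue
        preceding_vowels = [j for j in range(i) if segments[j] in VOWELS]
        if not preceding_vowels or preceding_vowels[-1] == accent:
            continue  # no preceding syllable, or the preceding syllable is accented
        neighbours = [segments[i - 1]] + segments[i + 1:i + 2]
        if any(n in VOICELESS for n in neighbours):
            continue  # adjacent to another voiceless sound
        out[i] = VERNER[seg]
    return out


def derive(segments, accent):
    """Pre-Germanic segments + index of the accented vowel -> Proto-Germanic segments."""
    shifted = verner(grimm(segments), accent)
    return [VOWEL_MERGERS.get(seg, seg) for seg in shifted]


singular = derive(["kʷ", "e", "kʷ", "l", "o"], accent=1)  # accent on the echo
plural = derive(["kʷ", "e", "kʷ", "l", "ā"], accent=4)    # accent on the ending
print("".join(singular), "".join(plural))
```

Storing the accent as an index rather than as a diacritic keeps the rule readable: Verner’s Law only needs to ask whether the nearest preceding vowel is the accented one.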

A few more developments took place before the split of Proto-Germanic into the East and Northwest groups. First, Proto-Germanic gave up contrastive accent in favour of fixed initial stress. This means that although linguists can sometimes infer the original location of the word accent from the outcome of Verner’s Law, both *xʷexʷla- and *xʷeɣʷlō had predictable initial stress in late Proto-Germanic, and the speakers of the language had no means of guessing where the *xʷ/*ɣʷ alternation came from. The occurrence of “non-Vernerian” and “Vernerian” variants no longer depended on stress-related factors. For historical reasons, they tended to be found in different grammatical forms, so speakers came to regard the conditioning as morphological, not phonological. But since grammatical contrasts are in most cases sufficiently signalled by other means (e.g. the use of inflectional endings), the cost of maintaining an obscure consonantal alternation may outweigh its functional importance.

Although the Germanic languages entered the historical scene rather late (in comparison with Hittite, Greek, Vedic or even Latin), they preserved some remarkably conservative features. The accent shift distinguishing some singular thematic nouns (with stems ending in *-o-) from their plurals (original collectives) was one of them. But the establishment of an initial-stress rule sounded the death knell of the distinction. The voicing alternation in words containing medial fricatives was not enough to keep it alive. Speakers of Late Proto-Germanic eliminated most of the Vernerian alternations from the noun system, generalising one of the variants at the expense of the other. In the comparative material we can see only some scattered fossils instead of a productive pattern.

Levelling out could happen either way. Some speakers generalised the consonant of the singular (*xʷexʷla-/*xʷexʷlō), and others that of the plural (*xʷeɣʷla-/*xʷeɣʷlō). In the Proto-Germanic speech community the basic form of the noun was effectively duplicated: it could be either *xʷexʷla- or *xʷeɣʷla-, both meaning ‘wheel’ and both occurring with the same case-endings.

Still before the breakup of Proto-Germanic, the status of labiovelar consonants became precarious. The voiced labiovelar fricative *ɣʷ was eliminated from most positions; word-medially it merged with the semivowel *w. Voiceless *xʷ lost its labial accompaniment (lip-rounding) before consonants. The result was like this: *xʷexla- ~ *xʷewla-. The correspondence between these variants was anomalous, since the normal “Vernerian” counterpart of non-labialised *x was *ɣ, not *w. This must have caused occasional transmission errors: *xʷewla- could be misheard and misinterpreted as *xʷeɣla- (by listeners who anticipated the voiced counterpart of *x). Thus, by the end of the Proto-Germanic period three variants of the stem were in circulation: *xʷexla- ~ *xʷewla- ~ *xʷeɣla-.
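
The two regular changes in this paragraph (medial *ɣʷ merging with *w, and *xʷ losing its rounding before a consonant) are simple enough to state as a rule over segmented stems. The Python sketch below is an illustration under stated assumptions: the function name and the segmentation are hypothetical, only the word-medial fate of *ɣʷ is handled, and the reinterpretation of *xʷewla- as *xʷeɣla- is left out because it is described as an occasional mishearing, not a regular rule.

```python
VOWELS = {"a", "e", "i", "o", "u", "ā", "ē", "ī", "ō", "ū"}


def simplify_labiovelars(segments):
    """Apply the two late Proto-Germanic labiovelar changes to a list of segments."""
    out = []
    last = len(segments) - 1
    for i, seg in enumerate(segments):
        following = segments[i + 1] if i < last else None
        if seg == "ɣʷ" and 0 < i < last:
            out.append("w")   # word-medial *ɣʷ merges with *w
        elif seg == "xʷ" and following is not None and following not in VOWELS:
            out.append("x")   # *xʷ loses its lip-rounding before a consonant
        else:
            out.append(seg)
    return out


# The two levelled stems from the preceding paragraph.
for stem in (["xʷ", "e", "xʷ", "l", "a"], ["xʷ", "e", "ɣʷ", "l", "a"]):
    print("".join(stem), ">", "".join(simplify_labiovelars(stem)))
```

Note that the initial *xʷ survives in both outputs because it stands before a vowel, which is why all the later variants still begin with a labiovelar.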

The resolution of a conflict between competing synonyms doesn’t happen overnight. In fact, it can take centuries unless speakers have a good reason to prefer one of the forms. In the case we are discussing, however, none of the competitors had a decisive advantage over the others, so their evolution proceeded in a “neutral” fashion. If there is no systematic bias, the relative frequencies of variants will vary randomly until the least lucky one drops out of use and is seen no more. But the three forms survived into the languages descended from Proto-Northwest Germanic before any of them reached fixation in the speech community. To be precise, we can identify reflexes of *xʷexla- and *xʷewla- in North Germanic, while all three can be found in West Germanic.
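
The “neutral” scenario can be made concrete with a toy simulation. The sketch below borrows the simplest resampling model from population genetics (a Wright-Fisher-style process); that choice, the population size, the function name and the equal starting frequencies are all assumptions made for the illustration, not claims about real Proto-Germanic speech communities.

```python
import random
from collections import Counter


def drift(variants, size=30, max_generations=100_000, seed=0):
    """Resample a population of variant tokens with no bias until one variant is left.

    Returns the per-generation counts; the loop stops early once a single
    variant has reached fixation.
    """
    rng = random.Random(seed)
    # Start with the variants as evenly represented as the size allows.
    population = [variants[i % len(variants)] for i in range(size)]
    history = [Counter(population)]
    for _ in range(max_generations):
        if len(history[-1]) == 1:
            break  # fixation: only one variant remains in use
        # Each new token copies a randomly chosen token of the previous generation.
        population = rng.choices(population, k=size)
        history.append(Counter(population))
    return history


history = drift(["xʷexla-", "xʷewla-", "xʷeɣla-"])
print("generations simulated:", len(history) - 1)
print("final counts:", dict(history[-1]))
```

Running it with different seeds gives different winners and very different waiting times, which is the point: without a systematic bias, which variant survives is a matter of luck, and a split in the speech community can easily come first.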

Let us now adjust the notations slightly to catch up with the phonological developments in the West Germanic languages. In their parent language, the articulation of *x was weakened in most positions, so that a glottal aspirate [h] became its default pronunciation, with velar or palatal fricatives remaining as positional variants determined by the context. I will therefore use the transcription *h rather than *x for this evolving phoneme. At that stage the surviving labiovelars were treated not as single phonemes but as sequences of two segments, *hw and *kw (*gw was at best rare; it may have become *g by that time).

We can thus assume the existence of three variants in Early Proto-West Germanic: *hwehla-, *hwewla-, and *hweɣla-. In West Germanic there was a tendency for consonants to undergo gemination (or, in plain English, doubling) when followed by *j or by one of the liquid consonants, *r or *l. Before *j the doubling was regular and affected all consonants except *z and *r (which soon merged as *r). Before *r and *l, it was sporadic and restricted to non-sonorants. Some speakers doubled the second *h of *hwehla-, so a new pronunciation, *hwehhla-, was added to the already existing pool of variants.[2] At a later stage of Proto-West Germanic the stem dropped its final vowel in the nominative/accusative singular. The four variants competing at that time were as follows: *hwehl, *hwehhl, *hwewl, and *hweɣl.

Soon afterwards, the West Germanic-speaking Angles, Saxons and Jutes embarked on their conquest of Britain. A few centuries later Old English began to be written down regularly. Which ancestral forms survived into literary Old English? The answer is quite surprising: none had been eliminated. Descendants of all the West Germanic variants can be identified in the Old English corpus:
  • hwēol ~ hwīol < *hwehl and *hwewl (from both sources)
  • hweohhol (hweohl- when inflected) < *hwehhl
  • hweowol ~ hweowul ~ hweowel (hweowl-) < *hwewl
  • hweogul ~ hweogel (hweogl-)[3] < *hweɣl
To be sure, their relative frequency was non-uniform; hwēol was by far the most common form, followed by hweow(V)l (which was about half as frequent), with the others lagging behind; but the competition was by no means over yet.

In Middle English times (11th-15th c.) this variety was drastically reduced. The variant whẹ̄l /hweːl/ (from OE hwēol) increased its frequency at the expense of all alternative forms, ousting them almost completely.[4] The variants whewel, wheghel, and even whefyl (apparently with /f/ from /x/, as in laughter and enough) lingered on for some time, but remained vanishingly rare and dialectally restricted. Their last remaining traces can be found in proper names, for example in the surname Whewell. You must have heard of William Whewell (1794–1866), the scholar who coined the words scientist and physicist, but you probably didn’t know his name was a fancy variant of wheel.

As you can well imagine, similarly complicated stories could be spun to present the evolution of the ‘wheel’ word and its variants in other Northwest Germanic languages. There are numerous interesting problems that I can’t discuss here for want of space. For example: What happened to Germanic *xʷexʷla- in High German? Where did the odd-looking Old Frisian variants fiāl and t(h)iāl come from? Well, I have to stop somewhere. There are other words waiting to be discussed.



[1] The mergers *o, *a > *a (for short vowels) and *ō, *ā > *ō (for long vowels) are also characteristically Germanic.

[2] There was a phonetic difference between medial *-h- and *-hh- surrounded by vowels or sonorants. The former underwent gradual weakening into a half-voiced glottal glide [ɦ] and was eventually dropped in the individual histories of the West Germanic languages, while *-hh- retained a strong velar articulation [xx] and survived much longer.

[3] OE g was still a voiced velar fricative, [ɣ], in this context.

[4] It was spelt in about twenty different ways, which however indicate more or less the same pronunciation. Rarer dialectal forms with Middle English /iː/ existed as well. The vowel of Modern English wheel /(h)wiːl/, however, comes from Middle English /eː/ via the Great Vowel Shift.