23 February 2016

The Strange Case of the Jumbled Vowels

Though scattered traces of athematic reduplicated presents can be found in several branches of Indo-European, it’s only Indo-Iranian and Greek that preserve them well enough to enable reconstruction. Indo-Iranian evidence is especially important, since that branch seems to distinguish two types of reduplicated presents, one with *e and the other with *i as the echo vowel. Moreover, the ablaut (vowel alternations) in the conjugation of reduplicated presents can be seen there more clearly than in Greek.

  • Vedic bábhasti, bápsati ‘chew, devour’, as if from *bʰe-bʰes-ti, *bʰe-bʰs-n̥ti [1]; 
  • Vedic jígāti, jígati ‘go’, as if from *gʷi-gʷah₂-ti, *gʷi-gʷh₂-n̥ti.

Some Indo-Europeanists believe that the two types are inherited and their coexistence in Indo-Aryan is an archaism rather than an innovation. In the LIV (p. 16) [2] they are reconstructed with different PIE vowel grades and accent patterns:

  • Type 1: *dʰé-dʰoh₁-/*dʰé-dʰh₁- (root *dʰeh₁- ‘put, place’); 
  • Type 2: *sti-stéh₂- [*stistáh₂-]/*sti-sth₂- (root *steh₂- ‘stand’).

In Greek, on the other hand, the echo vowel is invariably *i, and the root vowel (when accented, as in the singular) is always a reflex of *e. Note the characteristic triad of examples (three very common verb roots, each with a different laryngeal):

  • Greek títʰēmi ‘I put’, as if from *dʰi-dʰeh₁-mi;
  • Greek hístēmi ‘I cause to stand’, as if from *s(t)i-steh₂-mi [*sistah₂mi];
  • Greek dídōmi ‘I give’, as if from *di-deh₃-mi [*didoh₃mi].

[Figure: Type 1 and Type 2. Image: Wikimedia Commons]

It seems that Type 1 disappeared completely in the prehistory of Greek and all verbs originally belonging to it were absorbed by Type 2. The o-grade reconstructed in the LIV for Type 1 is not directly confirmed by Indo-Iranian evidence (all non-high vowels merged as /a/ there); it is inferred from rather complex assumptions about Proto-Indo-European vocalism. The only fact cited in its support is the anomalous o-grade present of Germanic *ðō- ‘do’ (found only in West Germanic). The idea that it represents dereduplicated *dʰé-dʰoh₁- inherited from Proto-Indo-European is hard to reconcile with our understanding of other reflexes of genuinely reduplicated *dʰeh₁- in Germanic (as we shall see).

Relics of reduplicated presents derived from *dʰeh₁- and *deh₃- [*doh₃-] can also be found in Balto-Slavic. The former had e-reduplication there, as shown by Lithuanian dẽda (3sg.) ‘lay, put’ and Old Church Slavonic deždǫ (1sg.) ‘put’ (< Proto-Slavic *de-d-je/o-, transferred to the *-je/o- conjugation). The latter, curiously, is reduplicated with Balto-Slavic *ō, as in Lith. dúodu, OCS damь (< *dad-mь < *dōd-mi, with athematic inflections). This *ō reflects earlier short *o, lengthened before non-aspirated *d (Winter’s Law). We can therefore reconstruct parallel reduplicated stems at an earlier stage of the Balto-Slavic parent language: *dʰe-dʰ- ‘put’, *do-d- ‘give’.

It’s clear that the “weak” form of the stem (with the root in zero-grade) was generalised in each case, but why are the echo vowels different? The most parsimonious explanation is that *dʰe-dʰ- is a straightforward reflex of PIE *dʰe-dʰeh₁-/*dʰe-dʰh₁- (levelled out in favour of the weak variant), whereas in the Balto-Slavic descendant of PIE *de-deh₃- [*dedoh₃-]/*de-dh₃- the echo vowel was assimilated to the laryngeally coloured root vowel of the “strong” stem (*dedoh₃- > *dodoh₃-). Subsequently, this new pronunciation was generalised across the paradigm (*dedh₃- > *dodh₃- > Proto-Balto-Slavic *dōd-), and only the weak variant survived into historical times. For this hypothesis to work, it is necessary to assume that the original strong vocalism of the reduplicated present of *dʰeh₁- was *e, not *o; otherwise it would also display the echo-vowel assimilation visible only in *dōd- ‘give’.
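The chain of changes proposed above can be traced mechanically. What follows is only a toy sketch in Python, not a serious model of Balto-Slavic phonology: the segment lists, the function names and the ordering of the steps (laryngeal loss before Winter’s Law) are simplifying assumptions made for illustration, and the stems are written in their phonetic form (*dedoh₃- rather than *de-deh₃-).

```python
# Toy model: a stem is a list of segments, so that "dʰ" or "h₃" counts as one unit.
# Assumed stem shape: C V C (V) H, i.e. index 1 is the echo vowel and
# index 3 of the STRONG stem is the root vowel.
LARYNGEALS = {"h₁", "h₂", "h₃"}
PLAIN_VOICED = {"b", "d", "g"}            # unaspirated voiced stops (simplified)
LENGTHENED = {"e": "ē", "o": "ō", "a": "ā"}


def assimilate_echo(strong, weak):
    """Spread the strong stem's root vowel to the echo vowel of both stems."""
    root_vowel = strong[3]
    new_strong = [strong[0], root_vowel] + strong[2:]
    new_weak = [weak[0], root_vowel] + weak[2:]
    return new_strong, new_weak


def drop_laryngeals(stem):
    """Remove laryngeals (here simply deleted, a deliberate simplification)."""
    return [seg for seg in stem if seg not in LARYNGEALS]


def winters_law(stem):
    """Lengthen a short vowel standing directly before an unaspirated voiced stop."""
    out = list(stem)
    for i in range(len(out) - 1):
        if out[i] in LENGTHENED and out[i + 1] in PLAIN_VOICED:
            out[i] = LENGTHENED[out[i]]
    return out


def weak_stem_reflex(strong, weak):
    """Assimilate the echo vowel, generalise the weak stem, apply the sound laws."""
    _, weak = assimilate_echo(strong, weak)
    return "".join(winters_law(drop_laryngeals(weak)))


# 'give': strong *dedoh₃-, weak *dedh₃-
give = weak_stem_reflex(["d", "e", "d", "o", "h₃"], ["d", "e", "d", "h₃"])

# 'put' with e-grade in the strong stem: *dʰedʰeh₁-, weak *dʰedʰh₁-
put = weak_stem_reflex(["dʰ", "e", "dʰ", "e", "h₁"], ["dʰ", "e", "dʰ", "h₁"])

# Counterfactual: 'put' with an o-grade strong stem *dʰedʰoh₁-
put_o_grade = weak_stem_reflex(["dʰ", "e", "dʰ", "o", "h₁"], ["dʰ", "e", "dʰ", "h₁"])

print(give, put, put_o_grade)
```

Under these assumptions the sketch yields dōd for ‘give’ and dʰedʰ for ‘put’, whereas the o-grade strong stem would give dʰodʰ, a shape not matched by Lithuanian dẽda or OCS deždǫ. That is the reasoning behind preferring *e in the strong stem of *dʰeh₁-.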

It seems reasonable to conclude that Type 1 and Type 2 differed much less than the LIV reconstruction suggests. The ablaut pattern of the root syllable seems to be the same in both types; the only significant difference between them concerns the choice of the echo vowel. This is how the two types are reconstructed e.g. by Don Ringe (2006: 28)[3]:

  • Type 1: *dʰé-dʰeh₁-/*dʰé-dʰh₁-;
  • Type 2: *stí-steh₂-/*stí-sth₂-.

Note the fixed accent on the echo syllable, consistent with most of the comparative evidence. On the other hand, this reconstruction doesn’t tell us why the root syllable alternates between e-grade and zero-grade. Nor does it help to account for the different echo vowels. Is the occurrence of e-reduplication beside i-reduplication just a messy fact of life, or are we missing something?

The two reconstructions can’t both be right, although they can both be wrong. I actually believe that neither of them is correct, and I’ll try to justify my opinion in the next post.


[1] The forms cited here are 3sg. and 3pl. The sequence *bʰs must have developed into something like Proto-Indo-Iranian *bzʰ as a result of progressive breathy-voice assimilation (Bartholomae’s Law). Although it ended up as voiceless [ps] in Vedic, the aspiration survived long enough to trigger the deaspiration of the initial consonant of bápsati by Grassmann’s Law.

[2] Helmut Rix, Martin Kümmel et al. 2001. Lexikon der indogermanischen Verben (2nd edition). Wiesbaden: Dr. Ludwig Reichert Verlag.

[3] Don Ringe. 2006. A linguistic history of English. Vol. 1: From Proto-Indo-European to Proto-Germanic. Oxford: Oxford University Press.

20 January 2016

Ve-Verbs: A Brief Introduction

There were several types of Indo-European verbs formed by reduplication. The most important of them are listed below. Each type is represented by a verb in its 3sg form (of the active voice, where relevant); for glossing purposes, a female subject is assumed:

  • athematic presents with a Ci or Ce echo: *sti-stéh₂-ti [*stistáh₂ti] ‘she’s rising to her feet’ [1];
  • thematic presents with a Ci echo: *sí-sd-e-ti [*sízdeti] ‘she’s taking a seat’;
  • thematic aorists with a Ce echo: *wé-wkʷ-e-t [*wéukʷet] ‘quoth she’;
  • perfects with a Ce echo: *me-món-e ‘she remembers’.

There are also a couple of other reduplicated present types, marked by the use of derivational suffixes. All of them have Ci echoes and are not very different from the second type above:

  • reduplicated sḱe-presents: *dí-dḱ-sḱe-ti [*dítsḱeti] ‘she accepts/learns’;
  • reduplicated desideratives: *wí-wrt-h₁se-ti [*wíwr̥tseti] ‘she wants to turn’.

[Figure: A famous reduplication (almost too perfect to be true). Image: Wikimedia Commons]

Still other types can be found in some languages of the family but cannot be safely added to the inventory of Proto-Indo-European verb stems because they are either too poorly attested or too restricted in their distribution. The former is true of athematic reduplicated aorists, and the latter of the Indo-Iranian intensives with “full” reduplication (more precisely, with a CVC echo). Attempts to demonstrate their PIE status have not been successful so far.

I shall begin with the first two types (“underived” reduplicated presents, both athematic and thematic). I’ve already had to mention reduplicated presents in earlier posts. There is some kind of relationship between them and reduplicated nouns, and some of the same issues, like the *e ~ *i alternation in the echo syllable, will be revisited. The exact reconstruction of the reduplicated present is one of the hot problems of Indo-European morphology, not yet settled to everybody’s satisfaction, but important enough for people to keep trying. In the technical literature on the subject, you will find a variety of proposals which can’t all be correct at the same time. I don’t insist that the analysis I’m advocating is the solution; still, it’s more worthwhile to take the bull by the horns and tackle a vexing question than just to report handbook stuff. Controversy makes for an interesting debate.

One important special problem to be discussed separately is the reduplicated “present” stem [2] of the root *dʰeh₁- ‘put, place’ (plus a dozen or two other meanings it acquired in the early history of Indo-European). Next, I shall discuss the Indo-European perfect, partly because of its importance for understanding the origin of the Germanic “strong” past tense (English sang, drove, bound, etc.).[3] The remaining loose threads will be tied up in the final post of this series.


[1] The reconstruction in square brackets is more phonetic, taking into account the operation of assimilatory processes, syllabification rules, and cluster simplification. The glosses are approximate: the exact shade of meaning produced by the combination of PIE tense, aspect and Aktionsart may be difficult to recover and even more difficult to convey in English.

[2] Why the scare quotes? Because the “present” (imperfective) stem did not occur only in the present tense, and it’s exactly the past-tense indicative of this stem, the so-called “imperfect” of PIE *dʰeh₁-, that played a role in the development of the Germanic verb system.

[3] None of them is reduplicated in Modern English, and few strong preterites remained reduplicated even in Proto-Germanic.

19 January 2016

The Root Question: Why *bʰer-?

The verb root *bʰer- has several paradoxical properties. On the one hand, it’s one of the most securely attested Indo-European roots, documented in Tocharian, Armenian, Greek, Phrygian, Albanian, Indo-Iranian, Balto-Slavic, Germanic, Italic and Celtic. On the other hand, it’s conspicuous by its apparent absence from Anatolian, which means that despite its ubiquity in the rest of the family its Proto-Indo-European status is insecure (but see below on possible Anatolian reflexes). The present stem *bʰér-e/o- is a widespread “simple thematic present”, so familiar as a handbook example that the whole class is often referred to as the *bʰéreti-type.[1] Still, several languages (Latin, Greek, Vedic) show traces of an alternative athematic stem without the *-e/o- suffix – probably a so-called “Narten present” with an underlying long vowel (*bʰḗr-ti, *bʰér-n̥ti). Despite being so common, and despite having such a basic meaning as ‘carry, bear’, the verb lacks some conjugational forms in some Indo-European languages, so that *bʰér- has to team up with other roots to form a complete paradigm. In Latin, for example, the present (ferō), the imperfect (ferēbam) and the infinitive (ferre) are derived from *bʰer-, but the perfect tense (tetulī or tulī) and the perfect passive participle (lātus < *tlātos) are provided by the root *telh₂- ‘lift, raise, support the weight of’. In Greek, we again have reflexes of *bʰer- in the present and the imperfect, while most other forms come from *h₁neḱ- ‘take, acquire’ (and the suppletive future oísō does not even have an established etymology). In Slavic, imperfective *bьrati, 1sg. *berǫ ‘take’ is paired with perfective *ęti, *(j)ьmǫ, from PIE *h₁em- (Lat. emō).

[Figure: Always collecting stuff... Photo: Jacek Zięba, CC BY-SA 3.0]

The meaning of *bʰer- is quite variable. In many branches its reflexes can be glossed as ‘carry, bear’ (of course English bear is a good example), with connotations of movement rather than static support, and of personal physical effort rather than vehicular transport (in the latter case *weǵʰ- ‘cart, convey’ is used). But the root has developed a large number of secondary senses: ‘take, take up, take away, collect, lift, bring, yield, produce, bear offspring, endure’, etc., and in some branches the core meaning has undergone a considerable semantic shift. Thus, Slavic *berǫ means ‘take’, while *nesǫ from the root *h₁neḱ- (originally ‘take, acquire’) has come to mean ‘carry’ (as if the two roots had swapped meanings). Lithuanian also has nèšti (1sg. nešù) for ‘carry’, but the meaning of Lith. ber̃ti, Latv. bḕrti is ‘scatter’ – so distant from ‘carry’ that doubts have been raised as to whether the Baltic words really derive from *bʰer- (though a development like ‘carry/take around’ > ‘circulate, distribute, disperse’ is quite natural, cf. Latin circumferō).

The oldest reconstructible present, *bʰḗr-/*bʰér-, probably meant ‘carry’ in a “telic” sense (as an action with an endpoint: ‘bring or remove by carrying’). The verb gave rise to a root agent noun, *bʰṓr ‘one who takes away’ → ‘thief’ (Latin fūr, Greek pʰṓr). The widespread simple thematic *bʰér-e/o-, which probably originated as the “mediopassive” voice of the original present (with self-benefactive or passive senses), basically inherited its semantics but emphasised the durative shade of the verb and its imperfective character (hence the need to employ some other root to express the perfective and stative aspects).

Vedic alone documents a clear contrast between telic *bʰér(-e/o)- (bhárati, also Rigvedic bhárti) and atelic (iterative, habitual) *bʰi-bʰ(é)r- (bíbharti, Rigvedic bibhárti, 3pl. bíbhrati), but given the fact that CV-reduplicated presents are generally a recessive class of stems in Indo-European, reducing rather than enlarging its membership in the historically known languages, we are probably dealing with an archaism rather than a local innovation.[2] In other words, the distinction between *bʰḗr-/*bʰér- and a reduplicated present (indicating, respectively, events with an endpoint and without one) may be at least as old as the Core Indo-European subfamily. It might even be Proto-Indo-European in the strict sense, assuming that the absence of the root *bʰer- from Anatolian is accidental and due to its having been ousted by (near-)synonyms such as Hittite arnuzi ‘brings, sends, delivers’ or pē-dai ‘carries’.

[Figure: ... and piling it up.]

Actually, isolated derivatives of *bʰer- may exist also in Anatolian. The Hittite word for ‘small rodent, mouse’, kapart-, has been etymologised as *ko(m)-bʰr̥-t- ‘gatherer, collector’ (of “stolen” grain).[3] There’s also a possible Lydian cognate, kabrdokid ‘steals’, a verb derived from an abstract noun supposedly meaning ‘hoarding away, stealing’.

The notion of collecting, gathering or bringing together often accompanies the use of *bʰer-. Greek pʰóros (from *bʰór-o-) means ‘earnings, tribute’, and one of the meanings of pʰorā́ (*bʰor-áh₂) is ‘crop’. The abstract noun *bʰr̥-tí- (Ved. bhṛtí- ‘carrying, bringing, support, maintenance’) acquires a concrete meaning in Armenian bard ‘pile, sheaf (of corn)’. Assuming hypothetically that the reduplicated iterative present could form a noun like Hitt. mēmal ‘groats’ (the product of grinding), we might expect *bʰé-bʰr̥ (or perhaps collective *bʰé-bʰōr) ‘the effect of continual collecting, a growing pile’. Like, say, a beaver’s construction – a dam or a lodge. The builder or inhabitant of a *bʰébʰ(o)r- would have been a *bʰébʰros (or possibly *bʰibʰrós, or both; the accent in nominals of this type is hard to predict), and an appropriate epithet referring to the same animal’s prominent behaviour – the assiduous collection and transport of building materials to repair, strengthen and enlarge its constructions – would have been *bʰibʰrús (or *bʰebʰrús) ‘one that’s always gathering stuff’ (timber, twigs, mud, etc.). I think the reduplication makes more sense with the root *bʰer- than with any other similar verb that might refer to something that beavers habitually do. The ability to cut down trees, for example, could be expressed by forming a simple agent noun; iterativity would not need to be emphasised. The male beaver’s legendary defensive stratagem – biting off its testicles and throwing them before hunters – would of course be a one-time trick; and “being brown” is not even eventive, let alone iterative.
So much for beavers, and for the topic of Indo-European nouns showing CV-type reduplication. The next post will be about reduplication in verbs.



[1] The simple thematic presents arose in the Core Indo-European group and are absent from the Anatolian languages, as far as we know. Only a small number are known from Tocharian; *bʰér-e/o- is one of them.

[2] The alternative iterative stem, *bʰor-éje/o-, is attested only in Greek as pʰoréō ‘carry around, wear, possess (a feature)’.

[3] See Lat. conferō ‘bring together, collect’, and compounds like Vedic iṣu-bhṛ́-t- ‘arrow-carrying’ (describing an archer).

18 January 2016

Towards a More Realistic Beaver

When we consider the known patterns of CV reduplication in Indo-European, we find that different reduplicated adjectives or nouns with very similar meanings can be derived in parallel from the same verb root. One pair already mentioned is Vedic sásni- (a cákri-type word) : siṣṇú-, both from the verb root *senh₂- ‘gain, strive after, accomplish’. Both adjectives mean, approximately, ‘constantly gaining/winning for oneself or others’. No CV-reduplicated present derived from this root is attested. It forms an Indo-Iranian reduplicated perfect (which, however, expresses a completed action and has no iterative or habitual connotations), and a Vedic “intensive” present with full reduplication (which does mean ‘gain/acquire repeatedly’ but is structurally different from the adjectives in question).[1] It is possible, however, that once a productive derivational schema became established, it was not essential that an actual CV-reduplicated present should exist. E(e)-R(ø)-i- or E(i)-R(ø)-u- adjectives, as well as E(e)-R(ø)-o- (*kʷékʷlo-type) nouns could be formed directly on the basis of a verb root. In one of the Rigvedic hymns to Indra (Book 6, 23:4b) the god is described as follows:
babhrír vájram papíḥ sómaṃ dadír gā́ḥ
carrying the vajra, drinking soma, giving cows
(doing all these things habitually, i.e. whenever he comes to attend a soma-pressing). We have no fewer than three cákri-type quasi-participles here.[2] Note that they take accusative objects, like the corresponding verbs. And yet, although all three verbs form CV-reduplicated presents in Vedic, the adjectives can’t be derived directly from those presents. The Vedic present of *bʰer- ‘carry’ (3sg./3pl.) is bi-bhár-ti [3]/bí-bhr-ati with an i-reduplication[4]; from *poh₃(i)- ‘drink’ we have pí-b-a-ti/pí-b-a-nti. At least in the latter case both the i-reduplication and the voiced *b (by assimilation, from the sequence *-ph₃-, with a voiced laryngeal) are very old, at least as old as the common ancestor of Vedic, Latin and the Celtic languages.[5] The adjective papí- seems to have been formed directly to the Indo-Aryan root -/-, using the cákri-type template. The type itself is probably an Indo-Iranian innovation (especially productive in Vedic), inspired by the use of *-i- rather than *-o- as the final vowel in compound stems. The precursor of the cákri-type is essentially identical with the *kʷékʷlo-type (except perhaps for an accentual contrast between nouns and adjectives, if the final accent of babhrí- is original and the initial one in cákri- is a Vedic innovation). Therefore the formation represented by babhrí- is a reworking of an older type which can be reconstructed as *bʰe-bʰr-ó- ‘(ever-)carrying’ – or, when substantivised, *bʰé-bʰr-o- ‘habitual carrier’. A parallel u-stem with practically the same meaning may also have existed, either *bʰi-bʰr-ú- (like Ved. siṣṇú-) or possibly *bʰe-bʰr-ú- (like Ved. (pari-)tatnú- ‘surrounding’).[6] Thus, both the *Ce- ~ *Ci- variation in the echo and the coexistence of stems in *-o- and *-u- can be explained with recourse to known Indo-European word-forming processes.

[Figure: Two well-known Indo-European semiaquatic mammals. From Conrad Gessner, De piscium et aquatilium animantium natura]

But wait a moment: *bʰé-bʰr-o- and *bʰi-bʰr-ú- look exactly like the reconstructed variants of the ‘beaver’ word. If beavers owe their Indo-European name not to their coat colour but to some characteristic habitual activity, the verb describing that activity should be similar to *bʰer- ‘carry’. There are, for example, a couple of known roots of the shape *bʰerH-, one meaning ‘cut, strike, pierce, fight’ (with an unspecified laryngeal) and the other ‘move rapidly, rush, chase’ (in which *H = *h₂ or *h₃). The laryngeal would have been lost in a reduplication containing the root in zero-grade, so we would not be able to see any difference between the outcomes of *-bʰr- and *-bʰrH-.

Stretching the imagination a little, one would be able to connect the meaning of any of these roots with the beaver’s habits. For example, the first *bʰerH- is glossed ‘mit scharfem Werkzeug bearbeiten’[7] in the LIV; and what are the beaver’s incisors if not “ein scharfes Werkzeug”? Still, I would like to defend the simplest solution, involving the most widespread and most securely reconstructed of these roots, namely *bʰer- ‘carry’. I will justify my preference in the next post. Here, let me only point out that no matter which root we choose, it makes sense to assume that there was more than one related but independently formed variant of the beaver’s name already at a very early stage – at least *bʰébʰros and *bʰibʰrús. It seems that both of them were inherited by languages ancestral to some of the branches of Indo-European. Their visible relatedness, and perhaps the existence in some branches of recognisably related reduplicated verb forms, could have produced still more variants through a kind of lexical cross-pollination, hence the attested variation of the echo vowel, the stem class, and the accentuation.


[1] Reduplication in verbs will be discussed in blog posts to come.

[2] They are accented on the stem vowel, unlike cákri- itself, but the accentual variation looks random and is not correlated with any functional difference.

[3] With the root syllable accented in the Rigveda. Later the accent was shifted to the echo syllable: bíbharti.

[4] When not reduplicated, the Vedic present (bhárati) usually has a telic meaning, i.e. ‘bring’ (a complete one-time activity) rather than ‘carry, bear, wield’.

[5] The original forms were *pí-ph₃-e-ti/*pí-ph₃-o-nti, with the second *p realised as [b].

[6] Cf. Germanic *tetru-, *tetru-ka- (or *titru-ka-?) ‘skin disease, scabies’ (OE teter, Mod.E tetter, OHG zitaroh), Sanskrit dadru-, dadrū (f.) ‘leprosy’, apparently from *der- ‘tear, flay, peel’.

[7] That is, ‘work on (something) with a sharp tool’ – a bit conjecturally, to be sure, since most of the attested meanings suggest the use of a weapon rather than a carpenter’s tool, or are figurative: ‘scold, rebuke’, etc.

12 January 2016

Enter the Beaver

[Figure: Beaver rampant. Arms of Biberach an der Riß, from Arthur Charles Fox-Davies, A Complete Guide to Heraldry (1909). Image: Wikimedia Commons]

Most etymological dictionaries, introductions to Indo-European studies, as well as online sources (including Wikipedia and Wiktionary) inform the reader that the Proto-Indo-European word for ‘beaver’, *bʰébʰrus, is a reduplicated derivative of the root *bʰer- or *bʰreu-, meaning ‘brown’. The same root is often claimed to account for the Germanic ‘bear’ word, *βer-an- (a nasal stem), as if from *bʰer-on- ‘the brown one’. There are several problems with these etymologies.

To begin with, neither *bʰer- nor *bʰreu- is attested as a stem. At best, there are several words in different Indo-European languages which contain reflexes of *bʰ and *r (and sometimes of *u) and mean something like ‘brown’; it is, however, hard to connect them formally within a plausible etymon. We can agree that Modern English brown, Modern German braun and Modern French brun (borrowed from Frankish) are “basic colour terms” and can be used to describe the colour of a beaver’s coat. It doesn’t follow, however, that the same can be claimed of their Proto-Germanic ancestor, *βrūna-. In early Germanic languages the word meant ‘dark, swarthy, dusky’ (as well as ‘shiny, bright’, often with reference to forged metal or the sea), and while it could be used to modify virtually any hue for which there was a name, it was hardly a specific colour term itself. Its extra-Germanic connections are anything but secure: although Greek pʰrū́nē (f.), pʰrũnos (m.) ‘toad’ might or might not be cognate, there is no related Greek colour adjective. The “colour conspiracy” of the modern languages of Europe, which have developed identical or very similar basic colour systems, is a case of recent cultural convergence. As late as the seventeenth century, German braun could still refer to hues in the violet/purple range (e.g. the colour of the amethyst).

[Figure: Modern version of the same (we know so much more about beavers today).]

Lithuanian bė́ras does refer to shades of brown, but is used as a specialised horse-coat term (like English bay), not a generally applicable colour word, and can’t be directly connected with *βrūna- anyway. Vedic babhrú- means ‘deep brown, reddish-brown’ and is practically identical with the reconstructed ‘beaver’ word, but it is probably derived from the animal’s name, not vice versa. The ancient Indo-Aryans had migrated too far from the geographical range of the beaver to have retained the original meaning, but they did keep the derived descriptive adjective.[1] Secondarily substantivised, babhrú- may refer to several rather different animals of India, from the brown mongoose to the Jacobin cuckoo.

The ‘bear’ connection is dubious too. A “weak” (n-stem) noun would presuppose an adjective like *bʰer(o)-, not recoverable as a Proto-Indo-European colour term (even the isolated East Baltic adjective mentioned above isn’t a perfect match), and there is an attractive alternative: the *βer- part can be derived either directly from the root noun *ǵʰwēr-/*ǵʰwer- ‘wild animal, beast’ (Ringe 2006: 106) or more plausibly from the corresponding thematic adjective ‘wild, savage’ (cf. Lat. ferus). To be sure, the hypothesis that word-initial *gʷʰ and *ǵʰw yield Germanic *β remains somewhat controversial (there are a small number of examples), but the etymology of bear as ‘the ferocious one’ is semantically unassailable. The substantivisation of an adjective by turning it into an n-stem is a common morphological process.

Instead of trying to guess in advance what the *-bʰr- part of the beaver’s name stands for, let’s have a look at the full reconstruction first. It’s usually cited as a stem in *-u-, perhaps primarily because of the Sanskrit ‘deep brown’ word, but the total Indo-European evidence is indecisive:

  • In Slavic *bobrъ, *bebrъ, *bьbrъ (note the variation of the echo vowel)[2] the final *-ъ may reflect *-o-s or *-u-s. Some old derivatives and toponyms plus accentual considerations suggest that the word was originally a u-stem in Slavic or perhaps vacillated between the two types, for there’s some evidence supporting an o-stem as well.
  • Baltic shows both u-stem and o-stem forms – the former in Old Prussian bebrus and in the Lithuanian variant bebrùs, the latter in Lith. bẽbras, bãbras, and Latvian bȩbrs.
  • Iranian has an o-stem reflex: Proto-Iranian *babra- > Younger Avestan baβra-, with the variant baβri-; cf. also Pahlavi babrag < *babraka-, with the very productive “colloquialising” suffix *-ka-.
  • Latin has fiber (second declension), as if from *bʰibʰro- (with an i-echo), beside sparsely attested feber.[3]
  • In Celtic, the inherited ‘beaver’ word has been buried under layers of lexical innovations (especially *abankos ‘river animal’) and borrowings. It can be detected in some Gaulish, Old Brittonic and Old Irish toponyms, ethnonyms and personal names, but its exact Proto-Celtic form is difficult to recover: *bebro-, *bebru-, *bibro- and *bibru- possibly coexisted in early Celtic.[4]
  • Finally, the word is excellently preserved in Northwest Germanic. [5] We have e.g. Early Old English bebr, bebir, beber, later befer, befor, beofor; Old High German bibar, bibur; and Old Icelandic bjórr < *bjǫβurr < *beβ(u)raz. All these forms can in principle reflect Proto-Germanic *βeβraz < *bʰebʰro-, though a u-stem can’t be completely ruled out.

The ‘beaver’ word has a relatively wide attestation, but since the animal itself has occurred mainly at northerly latitudes in historical times, it’s poorly attested in Indo-Iranian and Italic, and not at all in Armenian or in Greek (where we find kástōr instead, borrowed also into modern Albanian). Alas, although beavers lived in parts of ancient Anatolia, we don’t know what the speakers of Hittite or Luwian called them: they weren’t thoughtful enough to write something about beavers for posterity. The Germanic and Balto-Slavic languages have preserved the word best, and it’s in Balto-Slavic that we find the greatest diversity of variants.[6] What shall we make of this variety?

I will try to answer this question in the next blog.



[1] Cf. also Hurrian babrunnu, a technical horsey adjective borrowed from the language of the “Mitanni Indo-Aryans”.

[2] The echo vowel in the modern Slavic languages most often reflects *o (found in all Slavic languages today). The minor variant with *e has a wide but scattered distribution (Serbian Church Slavonic, dialectal Bulgarian, Slovene, Upper Sorbian, Old Russian) and looks like a locally surviving relic (see also the Polish river-name Biebrza, and Romanian breb, borrowed from Slavic). The modern prevalence of *o may be due to a Slavic tendency (inconsistent and poorly understood) to introduce and generalise *o in CV-reduplications. Borrowing is less likely, though Iranian influence has been suspected (as an indication of prehistoric trade in beaverskins and castoreum). Western Lithuanian bãbras seems to be Slavic-influenced. The variant *bьbrъ is rare (Old Russian, Serbo-Croatian dȁbar, with a dissimilated initial stop). It could be regarded as an aberrant local innovation, were it not for the fact that (unlike *bobrъ) it has several exact counterparts in other branches (West Baltic hydronymic *bibru-, Lat. fiber, Celtic *bibru- ~ *bibro-).

[3] Replaced by loanwords (some related to it) in Vulgar Latin and Proto-Romance.

[4] It seems that beavers never colonised Ireland after the last ice age, which of course does not mean that the Irish Celts were unaware of their existence. “Beavery” tribal names could also have been brought to Ireland from Great Britain and/or the continent during prehistoric migrations.

[5] Its absence from the Gothic corpus is due to the usual reason: no beavers in the Bible.

[6] Some East Slavic dialects preserve a uniquely specialised, evidently archaic word for ‘beaver lodge’, *zer(d)mę < *gʰerdʰ-mn̥, with a curious “hyper-satem” treatment of the root *gʰerdʰ- ‘gird, encircle, fence about’, cf. Slavic *gordъ ‘fort, town’, Lith. gar̃das ‘enclosure, stall’, Vedic gr̥há- ‘house’, Albanian gardh ‘fence’. The ‘beaver lodge’ word has been borrowed into standard Polish as żeremie (with a hypercorrect ż).

08 January 2016

Setting the Scene for the Beaver

In the Proto-Indo-European derivational system, adjectives in *-ó- were rarely formed directly from root nouns in zero-grade (often simply identical with verb roots). Thus, they differed from some other adjectival derivatives involving more complex suffixes, e.g. deverbal adjectives in *--, *--, *--. When just the bare thematic vowel *-ó- was added, the zero-grade was usually “reinforced” – in the simplest case, by inserting a full vowel, *e, somewhere inside the root (not necessarily in the “correct” place, that is, not always faithfully restoring the original e-grade). This, however, did not have to happen if the root was the second member of a compound. Since reduplications behave in many respects like compounds (namely, like a root compounded with itself), it is possible that nouns of the *kʷékʷlo-type should be traced back to adjectives like *kʷe-kʷl-ó- ‘revolving’, and these in turn to reduplicated root nouns like *kʷé-kʷ(o)lh₁-, expressing the action itself or its product (in this case, either ‘circular movement’ or ‘circle, cycle’).

In fact, such nouns are not purely conjectural. A few have left tangible reflexes in historically known languages. For example, Hittite mēmal ‘groats’ is an athematic neuter noun derived from *melh₂- ‘grind’ by means of CV-reduplication: *mé-ml̥h₂. In theory, a *kʷékʷlo-type noun could easily be formed via thematicisation and accent retraction: *mé-ml-o- ‘something used in groat production’ (e.g. ‘quern, millstone’); it just happens not to be attested (unless Armenian mamul ‘press’ is somehow derivable from it). As for reduplicated adjectives, Vedic examples such as vavrá- ‘hiding, concealing oneself’ and sasrá- ‘streaming’ can be quoted (the roots in question are, respectively, *wer- ‘cover, protect’ and *ser- ‘flow’).

There are, however, other reduplication types, also based on verb roots but harder to fit into the pattern proposed above. Superficially, they have the same structure: E(V₁)-R(ø)-V₂-, where R(ø) is a verb root in zero-grade. However, V₁ is *i rather than *e, or V₂ is a high vowel (*i or *u) rather than *o; note only that V₁ and V₂ can’t both be *i at the same time. Here are a few characteristic examples from Vedic (where such reduplications are particularly well represented):

Vedic word | PIE root
vavrí- ‘hiding-place’ | *wer- ‘cover’
cákri- ‘active, making’ | *kʷer- ‘cut, shape’
babhrí- ‘carrying’ | *bʰer- ‘carry’
sásni- ‘gaining repeatedly’ | *senh₂- ‘gain’
siṣṇú- ‘ever-securing’ | *senh₂- ‘gain’
jigyú- ‘victorious’ | *gʷei- ‘compel’
(pari-)tatnú- ‘surrounding’ | *tenh₂- ‘stretch’

[Image: Agni (with partial reduplication)]

Most of these virtual “protoforms” are not likely to be of Proto-Indo-European date; they only illustrate the operation of the derivational mechanism. Indo-European i-stems were typically nouns (often with an agentive meaning) derived from o-stem adjectives. There were also adjectival compounds in which the second member was an i-stem corresponding to a thematic noun (with *-o- replaced by *-i-).[1] Both processes seem to have affected some of the reduplications above. On the one hand, we have vavrá- (adj.) : vavrí- (noun); on the other, cákri- (adjective) looks as if it had originally corresponded to a noun of the *kʷékʷlo-type, and acquired its *-i- by conforming to the productive pattern of compound adjectives (as pointed out above, reduplications are compound-like structures).

But what about adjectives like jigyú- ‘victorious’? In their case, derivation from a *kʷékʷlo-type noun does not seem to be possible. Note that we have an adjectival doublet, sásni- ~ siṣṇú- [2], both derived from the same, widely distributed Proto-Indo-European root, but apparently in different ways.

As opposed to “second generation” adjectives with stems ending in *-i-, u-stem adjectives are a very old type. Some of them can be found on any list of basic Proto-Indo-European vocabulary. In some cases the root to which the *-u- is added is simply adjectival (meaning that it has no other known functions); but it may also be a recognisable verb root. In the last common ancestor of the Indo-European languages the root normally had zero-grade, and the suffix was accented: R(ø)-ú-. Here are a few typical examples: *tn̥h₂-ú- ‘thin’, *pl̥h₁-ú- ‘much, many’, *h₁s-ú- ‘good’, *mr̥ǵʰ-ú- ‘short, brief’, *gʷr̥h₂-ú- ‘heavy’, *h₁ln̥g(ʷ)ʰ-ú- ‘light, nimble, quick’. In terms of function, this *-ú- is almost equivalent to the suffix *-ró-, also found in many common adjectives (and often transparently deverbal), e.g. *h₁rudʰ-ró- ‘red’, *h₂r̥ǵ-ró- ‘flashing, swift’, etc. There are even occasional pairs of (near-)synonyms: *h₁ln̥g(ʷ)ʰ-ú- ≈ *h₁ln̥gʷʰ-ró- (from the verb root *h₁lengʷʰ- ‘move briskly’).[3] One important difference between the two types is that *-ró- adjectives do not occur in old compounds. We may therefore presume that if a “first generation” deverbal adjective was formed from a reduplicated verb, *-ró- was ruled out and *-ú- was the remaining option.

The frequent occurrence of *-i- in the echo syllable of u-stem reduplications may have something to do with the fact that *-ú- is normally added to an ablauting base in zero-grade. Perhaps *-i- was once treated as a weak allomorph of full-grade *-e-.[4] The recipe for a reduplicated u-stem adjective is therefore as follows: take a root (e.g. *senh₂-), reduplicate it using a CV template (*se-senh₂-), make it weak (*si-sn-), and add *-ú- (*sisnú-). Serve in a Vedic hymn to Agni the Bounteous (siṣṇú-).
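
The recipe can also be written out as a toy Python sketch. It is plain string concatenation, not a model of Proto-Indo-European phonology. The function name `u_stem_recipe` and its three arguments are hypothetical, introduced only for illustration. In particular, the weak shape of the root coda (*-n- instead of *-nh₂-) is supplied by hand, since the recipe above gives the weak form *si-sn- without deriving the loss of the laryngeal.

```python
def u_stem_recipe(onset, full_coda, weak_coda):
    """Return the four stages of the recipe as plain strings.

    onset      -- root-initial consonant, copied into the echo (assumed single)
    full_coda  -- what follows the root vowel *e in the full grade
    weak_coda  -- what is left of the coda in the weak stem (given by hand)
    """
    root = f"{onset}e{full_coda}-"                  # take a root
    reduplicated = f"{onset}e-{onset}e{full_coda}-" # CV echo + full root
    weak = f"{onset}i-{onset}{weak_coda}-"          # echo *e > *i, root in zero-grade
    adjective = f"{onset}i{onset}{weak_coda}ú-"     # add accented *-ú-
    return [root, reduplicated, weak, adjective]


stages = u_stem_recipe("s", "nh₂", "n")
print(" > ".join("*" + s for s in stages))
# prints: *senh₂- > *se-senh₂- > *si-sn- > *sisnú-
```

The last step to Vedic siṣṇú- involves regular Indo-Aryan sound changes that the sketch does not attempt to reproduce.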

In an earlier article (2007), I analyse the aberrant verb *gʷíh₃w-e/o- ‘live’ and the related adjective *gʷih₃w-ó- ‘living, alive’ as ancient reduplications: *gʷi-h₃w-ó- (from pre-PIE *gʷi-gʷw-ó-) has retained an archaic weak vowel of the echo because its reduplicative structure became obscured very early and was thus protected from any kind of analogical “repair”. Reduplications in *-ú- are similar to those in *-ó-. However, u-stems are more likely to retain their adjectival character, while o-stems can easily be substantivised by means of accent retraction (so that “second generation” cákri-type adjectives must sometimes be generated to replace their lost thematic ancestors).

I am awfully sorry if the discussion above seems too technical, but I shall need to refer to this formal background when presenting the hero of the next post (to appear during the weekend) – the Proto-Indo-European word for ‘beaver’. I was actually planning to deal with beavers today, but I realised that some complicated stuff had better be clarified beforehand.



[1] Cf. Lat. inermis < *n̥-h₂armi- ‘unarmed’ (literally ‘[having] no-weapon’) vs. arma ‘arms’ (an o-stem plural).

[2] To be sure, siṣṇú- occurs only once in the Rigveda (Book 8, 19:31) as an epithet of Agni.

[3] Gk. elakʰús ‘small’, elapʰrós ‘light, quick, small’. Note that labiovelar stops regularly lost their labial component before *u/*w already in Proto-Indo-European.

[4] Cf. the realisation of unstressed etymological /e/ as a high vowel [ᵻ] in many English words, including obscured compounds (as in the traditional pronunciation of forehead, to rhyme with horrid).

05 January 2016

A Reduplication Manual for Drivers, Metalworkers, and Birdwatchers

The case of *kʷékʷlo- is relatively clear, perhaps because the word is not as old as some other Indo-European vocabulary, and its derivation from a verb root is at least semi-transparent. The same goes for several other reduplications with the same structure (echo + known root in zero-grade + thematic vowel). Unlike *kʷékʷlo-, they are less securely attested, appearing only in one or two branches of Indo-European. For example, Latin aurum ‘gold’ has a nice Baltic cognate, Lith. áuksas (dialectal áusas). A careful comparison of these words, taking into account the Baltic tonal accent, leads to the reconstruction *h₂áh₂uso- (= *h₂é-h₂us-o-, originally neuter, as in Latin) for the common ancestor of Italic and Baltic.[1] The verb root here is *h₂wes- ‘light up, dawn’. Gold is therefore “that which glitters”, and we can speculate that the reduplication symbolically suggests the intermittent flickering of reflected light.

Another interesting example, this time restricted to Baltic and Slavic, is the word for ‘cuckoo’: Lith. gegužė̃, Old Russian zegzica, žegъzuľa (< Proto-Slavic *žegъza, extended with various suffixes).[2] The ancestral Balto-Slavic form can be reconstructed as *geguźaH. Because of its tongue-twisting, repetitious consonant pattern the word is often “explained” as onomatopoeic, though it bears little actual resemblance to any noise characteristic of the common cuckoo. If, however, we try to analyse it as a *kʷékʷlo-type reduplication and travel back in time beyond Proto-Balto-Slavic, its structure becomes clearer: *gʰegʰuǵʰah₂ (= *gʰe-gʰuǵʰ-e-h₂), a feminine noun based on the verb root *gʰeuǵʰ- ‘hide’ (known from Indo-Iranian and Baltic). I’ll leave open the question whether the Balto-Slavs dubbed the cuckoo “the hiding one” because it is notorious for hiding its eggs in other birds’ nests, or because it’s frequently heard but almost never seen:

          No bird, but an invisible thing,
          A voice, a mystery. [3]

Both explanations make enough sense to justify the etymology.[4]

[Image: A strange reduplication hiding in a reed warbler’s nest.]

Let us use the following shorthand notation: R is a root morpheme and E is its reduplicative echo (filling a CV template). The vocalism of a morpheme will be shown in brackets: R(ø) means a root in zero-grade (without a vowel), E(e) means an echo in e-grade, and E(é) means that the e-grade echo is accented. A *kʷékʷlo-type derivative can be defined as follows: E(é)-R(ø)-o-. The accent was shifted to the thematic ending when an o-stem formed a collective, and it’s possible that the vowel of the echo was phonetically weakened as a result, yielding E(ə)-R(ø)-é-h₂, but this was a superficial change, easy to reverse by analogically restoring the full vowel of the singular. Note that while ‘wheels’ often occur as a specific set, ‘gold’ is an uncountable mass noun and ‘cuckoos’ do not form natural assemblies. Therefore, of the reduplicated words discussed so far, only ‘wheel’ had a frequently used collective form.
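
To make the shorthand concrete, here is a minimal Python sketch that fills in the two templates, E(é)-R(ø)-o- and E(ə)-R(ø)-é-h₂, by simple string substitution. The function names `thematic_noun` and `collective` are hypothetical, and the zero-grade of the root is passed in ready-made rather than computed. The collective output is nothing more than the template spelled out; it is not offered as an attested or independently reconstructed form.

```python
def thematic_noun(echo_consonant, zero_grade_root):
    """Fill the template E(é)-R(ø)-o- (accented e-grade echo)."""
    return f"{echo_consonant}é-{zero_grade_root}-o-"


def collective(echo_consonant, zero_grade_root, weakened=True):
    """Fill the template E(ə)-R(ø)-é-h₂ (accent on the thematic vowel).

    With weakened=False the full echo vowel is kept, mimicking the
    analogical restoration of *e from the singular.
    """
    echo_vowel = "ə" if weakened else "e"
    return f"{echo_consonant}{echo_vowel}-{zero_grade_root}-é-h₂"


# 'wheel' and 'gold', segmented as in the text
print("*" + thematic_noun("kʷ", "kʷl"))    # *kʷé-kʷl-o-
print("*" + thematic_noun("h₂", "h₂us"))   # *h₂é-h₂us-o-

# template-generated collective of 'wheel', weakened and restored
print("*" + collective("kʷ", "kʷl"))
print("*" + collective("kʷ", "kʷl", weakened=False))
```

Keeping the echo consonant and the zero-grade root as separate arguments mirrors the E and R of the notation and avoids pretending that the zero-grade can be derived mechanically from the full root.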

The “thematic” suffix *-e/o- was very productive in the formation of adjectival derivatives. Proto-Indo-European adjectives had the same patterns of declension as nouns, and adjectives could be substantivised (converted into nouns) by so-called “internal derivation”: not by suffixation, but by modifying the accent or vocalism. Many thematic nouns must have originated as adjectives turned into nouns simply by retracting their accent from the thematic vowel. If this happened after the period of dramatic vowel changes – strongly sensitive to the location of stress – which produced the Indo-European ablaut patterns, the accent retraction did not affect the vocalism of the adjective. Some nouns retracted their accent just because they were nouns, even if they had no adjectival counterpart.[5] When such a derivational pattern became productive, reduplicated thematic nouns could be formed from verb roots directly, skipping the adjectival stage, but the meaning of the noun was still “descriptive”, referring to a characteristic action carried out in an iterative or frequentative manner (“turning round and round”, “flashing again and again”, “hiding habitually”).

Since we have examples of E(é)-R(ø)-o- from both Anatolian and Core Indo-European, the type must have originated already in Proto-Indo-European; but given the limited distribution of most of the nouns formed in this way, they were probably coined at the “dialectal” stage, when the descendants of Proto-Indo-European were already diverging into distinct languages. Younger derivatives are typically more transparent and less irregular than old ones – which is precisely what makes their derivation productive. In the next post, I shall try to argue that an older layer of reduplicated nouns, less transparent and harder to analyse, can also be identified.



[1] Michiel C. Driessen. 2003. *h₂é-h₂us-o-, the Proto-Indo-European term for “gold”. Journal of Indo-European Studies 31, 347–362.

[2] The accumulation of stops and fricatives makes the word prone to phonological distortion, so the ancestral sequence ž…gz may change into ž…gž, z…gz, z…z or gž…gž by assimilation at a distance. The Modern Polish word for ‘cuckoo’ is kukułka (an onomatopoeic innovation with a different kind of reduplication), but conservative forms, such as zazula and gżegżółka, have survived in regional dialects. Most Poles remember the funny-looking word gżegżółka (pronounced [gʐɛˈgʐuwka]) from the classroom spelling-tests they were tortured with in their school days. They may dimly recall that it referred to some sort of bird. Schoolteachers, however, rarely explain which bird it is – they probably aren’t sure themselves. Gżegżółka has therefore become an interesting example of a word which has no real communicative function but has been co-opted for a marginal social use (testing schoolchildren’s orthographic memory). See zyzzyva for a similar phenomenon.

[3] William Wordsworth, To the Cuckoo.

[4] The idea is mine, but you are free to share it as long as you credit your source. It has the additional advantage of making it possible to connect Balto-Slavic *geguźaH with the Proto-Germanic word for ‘cuckoo’, *ɣaukaz, on solid formal grounds. The similarity between them has been noticed before, but mere similarity means little without a detailed morphological analysis.

[5] Note that many PIE thematic nouns were accented on a zero-grade syllable, e.g. *h₂ŕ̥tḱos ‘bear’ or *wĺ̥kʷos ‘wolf’.