30 May 2013

The Water Story

When during the First World War the Czech orientalist Bedřich Hrozný was copying cuneiform inscriptions from the Hittite royal archive, deposited at the Imperial Ottoman Museum in Constantinople, it suddenly dawned on him that the still enigmatic language was Indo-European. One of the first words that he was able to interpret was wa-a-tar (wātar) ‘water’. Hrozný already knew that the preceding clause meant something like ‘and you will eat bread...’, so ‘drink water’ certainly made sense as a continuation. Of course even the occurrence of a familiar-looking word in the right context doesn’t mean much by itself, but the newly excavated Hittite corpus was sizeable and Hrozný was soon able to understand large fragments of the texts and identify (not always correctly) more Indo-European material in them – from pronouns and sentence particles to verbs, nouns and adjectives.

The similarity of wātar to words for ‘water’ in other branches of IE is not accidental, and the word is inherited from a common ancestor rather than borrowed. We can say so with confidence not simply because the sound correspondences look fine. The ‘water’ word is declined in Hittite, with inflectional endings familiar from elsewhere. What’s more, the declension of wātar is irregular in an interesting way: the stem has the variant witen- in the oblique cases (such as the gen.sg. witenas), and its nom./acc.pl. is witār. Those Hittite alternations can be traced back to a reconstructed pattern like *wódr̥, *wedén-os, *wedṓr – with vowel substitutions, accent shifts, and a characteristic *r/n alternation in the suffix, found also in neuter stems in other morphologically conservative IE languages.

Hittite preserves a unique variety of stem variants in one paradigm. Other IE languages have levelled them out at least partly:
  • Greek has húdōr, gen.sg. húdatos, nom./acc.pl. húdata. The a of the suffix in the oblique cases and in the plural reflects a pre-Greek syllabic nasal (*ud-n̥-t-os, *ud-n̥-t-ah₂, with an extra -t- that is a Greek innovation), which means that the *r/n alternation is indirectly reflected there, but the root syllable has a fixed shape (the full vowel *e/o was deleted, leaving */wd-/ = *ud-); also the accent is fixed on the initial syllable.
  • In Vedic, only a few isolated case forms of the word survived (loc.sg. udán ~ udáni, gen.sg. udnás, nom./acc.pl. udā́), with alternations restricted to the suffix, as in Greek, but with the word accent anywhere but on the root, quite unlike Greek.
  • In Germanic, the root syllable has the same full vowel throughout (*wat-, reflecting older *wod-); the *r/n alternation is still visible, but the variants with *r and *n are segregated among different Germanic languages (cf. Old English wæter vs. ON vatn, both remodelled as vowel-final stems: *wat-r-a- vs. *wat-n-a-). Gothic, in which the stem remained consonantal, generalised the nasal variant at the expense of *r: nom.sg. wato, gen.sg. watins, dat.pl. watnam (as if from pre-Gmc. *wod-ōn, *wod-en-, *wod-n-).
  • In Baltic the suffix has a nasal, but there is also another nasal, curiously infixed in the root, presumably due to the generalisation of anticipated nasality: *wod-n-/*ud-n- > vand-/und-, as in Lithuanian vanduõ, acc.sg. vándenį, Latvian ûdens, Old Prussian wundan, unds.
  • Some of the other IE languages also preserve traces of the noun (Slavic *voda, Umbrian utur, abl.sg. une, etc.), and numerous words derived from the stem *w(V)d-(V)n/r- appear even in those languages in which the primary noun has been lost, cf. Latin unda ‘wave’ < *ud-n-ah₂ (with a metathesis common in Latin and convergent with what we see in the declension of the Baltic ‘water’ word).
Otter < OE oter < PGmc. *utraz < *ud-r-o-s
It’s a nice jumble of forms, not even quite compatible across related languages because of the independent fixation of different innovations along different branches of the family tree. It took the efforts and accumulated insight of several generations of Indo-Europeanists (culminating in the work of late 20th-century scholars such as Jochem Schindler) to explain their complicated evolution in detail. 

The hypothetical common starting point is an “acrostatic” neuter noun with an *o/e alternation (see here for a similar case): nom./acc. sg. *wód-r̥, oblique *wéd-n-, collective pl. *wéd-ōr (from a still earlier *wéd-or-h₂, where *h₂ was a collective ending, lost already in PIE after a stem-final liquid or nasal but causing the compensatory lengthening of the vowel of the stem-forming suffix). The *e in the root syllable was the “weak” counterpart of the “strong” grade *o. But PIE *e was ambiguous, because it could also represent the strong grade of some roots, whose weak variants lacked the vowel. On the analogy of such roots, new weak stems were created, with the *e deleted and the accent shifted to another syllable: collective *udṓr, oblique *udén- (especially in the loc.sg.) or *udn- (followed by an accented inflectional ending). The collective plural (‘waters’ = ‘a vast quantity of water’) was occasionally reinterpreted as a singular mass noun. Its declension was then remade as follows: nom./acc.sg. *wédōr or *udṓr, gen.sg. *udn-és.
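The creation of a new weak stem described above can be broken into two steps, an accent shift and the deletion of the root *e. The following is only a toy sketch of that bookkeeping, not a model of PIE phonology: the `Stem` class and the function names are made up for illustration, accent marks are stored as a separate field rather than as diacritics, and only the collective *wéd-ōr is run through it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stem:
    root: str     # root segments without accent marks, e.g. "wed"
    suffix: str   # stem-forming suffix, e.g. "ōr"
    accent: str   # morpheme carrying the accent: "root" or "suffix"


def shift_accent(stem: Stem) -> Stem:
    """Move the accent from the root to the suffix."""
    return Stem(stem.root, stem.suffix, accent="suffix")


def delete_root_e(stem: Stem) -> Stem:
    """Delete the root *e; *w before a consonant is then written *u (*wd- = *ud-)."""
    root = stem.root.replace("e", "", 1)
    if root.startswith("w") and len(root) > 1 and root[1] not in "aeiou":
        root = "u" + root[1:]
    return Stem(root, stem.suffix, stem.accent)


def new_weak_stem(stem: Stem) -> Stem:
    """Both steps together: accent shift, then deletion of the root vowel."""
    return delete_root_e(shift_accent(stem))


old_collective = Stem("wed", "ōr", "root")       # *wéd-ōr
new_collective = new_weak_stem(old_collective)   # *ud-ṓr
print(new_collective)
```

Keeping the two steps as separate functions makes it easy to see what each one contributes: the accent shift alone leaves the root as *wed-, while the o-grade root of *wód-r̥ has no *e and passes through `delete_root_e` unchanged.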

Such analogical remodelling must have taken place already in the common ancestor of all the IE languages, and was continued after the breakup of Indo-European unity. The state of affairs visible in Hittite is archaic, but only in a relative sense. In the ancestor of Hittite the noun became accentually mobile – for example, the old gen.sg. *wéd-n̥-s was replaced by *wed-én-(o)s on the analogy of nouns with a shifting accent – but no new weak grade was generated in the process.

[Table not reproduced: development of the ‘water’ paradigms, acrostatic → mobile and collective → singular, including the nom./acc. collective forms.]

The emergence of new variant paradigms is schematically shown in the table above. The forms on the left are the oldest ones; those in the last column illustrate some post-PIE developments (as reflected e.g. in Germanic). It is important to realise that there must have been considerable variation (rather than a single paradigm) already in the most recent common ancestor of the known IE languages. That variation supplied the raw material for later developments, which could be compared to independent attempts to assemble a new vase from the scattered fragments of several broken ones. Alternative PIE paradigms, each of them too complex to survive in the long run, were mixed up, reorganised, and independently simplified in the daughter languages. We understand the process rather well because the ‘water’ word fits into a more general pattern together with other words of a similar structure, and their evolution is part of a still grander model of inflection and phonological alternation in PIE nouns. Despite its complexity it’s not an arbitrary just-so story but a coherent and well-constrained theory explaining a large segment of PIE grammar. We don’t know everything about its prehistory. For example, we are still very much in the dark about the origin of the *o/e alternation in acrostatic roots. We take the left-hand column of the table as the point of departure because it represents the earliest stage we can safely reach given our current understanding of Proto-Indo-European.

To sum up, the fact that Hittite wātar is similar to English water is interesting but not particularly impressive as an isolated observation. Similarities can be found between any languages chosen at random. It’s far more significant that the inflectional pattern visible in Hittite helps us to understand the origin of the diversity displayed by cognate ‘water’ words elsewhere in the IE family and is part of the evidence used in the reconstruction of the PIE morphological system. It’s those pervasive shared patterns that demonstrate the membership of Hittite in the IE family.

But wait a minute... I promised to discuss the global etymon ʔAQ’WA, right? Why am I talking of PIE *wódr̥ instead? Well, because it’s the best-attested IE word for ‘water’, supported by a wide array of comparative evidence. Anyone trying to establish a genetic relationship between IE and other language families had better keep this in mind. But surely there are other ‘water’ words in IE that are possible candidates for PIE status and could be of interest to long-rangers? Perhaps, but they’ll be discussed in a separate post. We are taking a roundabout route to ʔAQ’WA, but we'll eventually get there.



  1. I have an example of the exact opposite to the careful process and pattern of actual evidence you are talking about here. This is the kind of thing you are criticizing.

    New proto-world etymon! English 'salad' and Lushootseed 'sʔəɬəd' "food". Pretty undeniable, right?

    Well no. 'sʔəɬəd' consists of a root 'ʔəɬ' "eaten" (all Lushootseed roots are monovalent) and the transitivizing suffix 'əd' and the 's' is a nominalizer. So it means "what [we] eat."

    Whereas 'salad' consists of a root for salt with a stativizing suffix.


  2. Hi Piotr,

    Outside of Hittite, what other testimony is there for nom./acc. pl. *wedo:r?

    Or, is there evidence for a more general pattern within heteroclitic stems of nom./acc. sg. *C1oC2r. versus n.a.pl. *C1eC2o:r?


    1. Hittite is the only witness here, but Schindler's acrostatic paradigms have in general left few and scattered direct reflexes. The reconstructed form is necessary to make sense of *wod-r/n- (as attested also in Balto-Slavic and Germanic) vs. *ud-ṓr/-n-. We know that collective formation tended to be accompanied by a forward stress shift (also in thematic stems, irrespective of gender, cf. *kʷékʷlos vs. coll. *kʷəkʷláh₂). The o-vocalism of *ud-ṓr is not typical of hysterokinetic stems (except for those that are known to have had acrostatic/amphikinetic variants), but makes sense as the realisation of an originally reduced post-tonic vowel lengthened by Szemérenyi's Law. For these and other reasons the form *wéd-ōr fills a gap in the logical pattern: Schindler's theory accounts for its vocalism and its relationship to *wódr̥, and on the other hand it is a plausible source of *ud-ṓr. We even have the "transitional form" between them, *wed-ṓr (the Hittite collective witār was accented on the second syllable).

    2. It should be "Szemerényi's Law". A bad kind of typo to make when one is writing about accentuation.

  3. You mentioned, though, that some languages had a n.a. sg. form based on the originally plural *wedo:r -- were you referring here to examples like Gr. húdo:r, or are there attested (non-Hittite) forms that show e-grade rather than zero/o-grade?

    (And, to ask what might be an elementary question, is it completely unproblematic to identify the i-vowel of witenas, wita:r with earlier *e?)

    The reason that I'm curious about this particular form (*wedo:r) is that it seems to be a key component of the argument for an IE origin for words such as Finnish vesi, etc. I'm skeptical of this theory because of (among other reasons) the seeming sparseness of any e-grade forms in the reflexes of *wodr.: these reflexes seem to tend overwhelmingly in favor of zero- and o-grade.

    1. PS. -- I'm sorry to digress from the original post topic. This is just something I've been curious about for quite a while.

    2. Few things are unproblematic as regards Hittite spelling and its phonetic interpretation. In witār and the oblique case-forms, the initial syllable is variably spelt we- or wi- (ú-e/i-t/da-a-ar). The mainstream interpretation of the variability is that the /e/ was phonetically raised when unstressed (cf. kessar 'hand', gen.sg. kissaras, from pre-Anatolian *ǵʰésōr, *ǵʰesrós). There are dissenting voices, especially in Leiden; Alvin Kloekhorst argues that the e/i spelling stands for an epenthetic vowel inserted in older *ud-ṓr. I am not convinced that we gain anything by postulating a type of proterokinetic pattern with *o and explaining away the evidence for *e. Hittite has other spectacular examples of archaic ablaut structures not found elsewhere, cf. tēkan 'earth', gen.sg. taknās, from amphikinetic *dʰéǵʰōm/*dʰǵʰm-ós (cf. Gk. khthṓn, Ved. kṣā́s). I'd say that if Uralic *weti is a loan from IE, it must be very old indeed (from some unusually conservative source if not PIE itself). Accidental similarity or some deeper genetic or areal connection look like serious alternatives in this case.

    3. PS. In Slavic we have *vědro 'bucket, pail' < *wed-r-o- (the vowel lengthening is due to Winter's Law in Balto-Slavic), but this is clearly a vr̥ddhied derivative with a secondary full grade. The Uralic form, if borrowed, can hardly reflect the original o-grade nom./acc.sg. (I'm not sure about the collective, since I have no idea how speakers of Proto-Uralic would have treated unstressed *-ṓr). Perhaps the old acrostatic locative *wédn(i) could have been the source?

      There are also traces of a neuter s-stem *wedes-, and there's an isolated i-stem, vaiδi- 'watercourse' (as if from *wedi-), in Young Avestan. The latter would be the ideal prototype for the Uralic word, but it doesn't mean 'water' as a substance and is only marginally attested.

    4. There is no reason to prefer an i-stem as an original for the Uralic word, though. Proto-Uralic final *-i does not signify [i], it simply signifies a non-open vowel contrasting with open *a/*ä. There is still no real consensus on how this was realized (I prefer to write *ə) but at any rate it's been the catch-all substitute vowel for all non-open unstressed vowels in loanwords.

      A bigger disconnect is the absence of any sign of the n/r "suffix" seen in IE, as CVCVC(V) roots were definitely tolerated in Proto-Uralic. This includes basic nouns such as *śüðäm (or *śüðämə?) "heart", *šiŋərə "mouse" (though these seem to have always contained a sonorant as the "suffix" consonant, so an IE *s-stem being reflected as a Uralic vocalic stem could work). The issue comes up in some other PIE → (pre)PU loan proposals such as "name" as well.