26 May 2013

Water, Water Everywhere: Back to Global Etymologies


The Eurasiatic interlude was longer than I had originally planned. It’s time to return to Proto-World and “global etymologies”. Few things are more instructive than a nicely dissected example, so I shall compare different approaches to analysing genetic relationships and illustrate them with real data.

No matter how severely we criticise the long-range reconstructions of Nostratic/Eurasiatic, they are proposed by scholars who respect the standard comparative method and appreciate its importance for separating signal from noise. According to the mainstream approach, it is not enough to observe that numerous pairs of words across two languages are similar in form and meaning. One ought to analyse the similarities carefully in order to decide whether they are more likely the consequence of common ancestry than of non-genetic factors such as horizontal diffusion (borrowing), functional convergence (onomatopoeia, etc.), or blind chance. Attempts to meet the accepted standards in inter-family comparison may fail, but at least there are people courageous enough to accept the challenge.

[Image: M. C. Escher, Rippled Surface (1950)]

But there is also a different approach, called multilateral comparison (a.k.a. mass comparison), according to which genetic relationships can be (and indeed have always been) established without assembling regular sound correspondences and reconstructions. To classify a set of languages (the larger the better) one only needs a collection of tabulated data (a list of basic vocabulary and grammatical morphemes for each language will suffice), a good eye for spotting patterns, and some general linguistic training (as opposed to the expert knowledge of some of the languages being compared). It doesn’t really matter if the evidence is partly corrupt or incomplete: as long as there’s plenty of it, its cumulative weight makes errors cancel out. Finding lexical matches across a large number of languages requires no analytic skills or painstaking detective work: enough evidence leaps out at you from the printed page as you eyeball it. Classificatory conclusions can be drawn simply from inspecting the data, with a confidence approaching certainty.
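A toy calculation shows why "plenty of data" can cut both ways. The sketch below assumes, purely for illustration, that each language independently has a fixed probability `p_match` of containing a word that resembles a given form and meaning by accident. The function name and the figures fed into it are hypothetical, not estimates drawn from real lexical data.

```python
from math import comb


def chance_match_probability(n_languages: int, p_match: float, min_hits: int = 1) -> float:
    """Probability that at least `min_hits` of `n_languages` independent
    languages contain a chance look-alike, each with probability `p_match`."""
    if not 0.0 <= p_match <= 1.0:
        raise ValueError("p_match must be between 0 and 1")
    # Sum the binomial probabilities of exactly k hits for every k below the
    # threshold, then take the complement.
    below = sum(
        comb(n_languages, k) * p_match**k * (1 - p_match) ** (n_languages - k)
        for k in range(min(min_hits, n_languages + 1))
    )
    return 1.0 - below


if __name__ == "__main__":
    # Assumed per-language chance of an accidental look-alike: 1% (illustrative only).
    for n in (2, 20, 200, 2000):
        print(n, round(chance_match_probability(n, 0.01, min_hits=3), 3))
```

Under these simplified assumptions the probability of finding a handful of accidental matches rises steadily with the number of languages inspected, so the sheer size of a data set does not by itself make errors cancel out.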

The best-known advocate of multilateral comparison was Joseph H. Greenberg (1915-2001), who famously used it to classify all the languages of Africa into four genetic stocks, and then to hypothesise that all the native languages of the New World with the exception of the Eskimo-Aleut and Na-Dene families formed one vast macrofamily, dubbed “Amerind”. He was also the original proponent of “Eurasiatic” – a hypothetical genetic grouping similar to the older concept of “Nostratic”, though not identical with it. Greenberg’s successors have boldly extended his methodology to the study of the world’s languages, not only grouping them into one global phylogeny, but also arriving at twenty-seven examples of “global etymologies” labelled with approximate reconstructions (Bengtson & Ruhlen 1998). This is quite surprising, since according to their own principles comparative reconstruction is a separate technical task, not required for a correct classification. Nevertheless, mass-comparatists often propose impressionistic reconstructions, and even compile etymological dictionaries where hundreds of such reconstructions are offered (cf. Greenberg & Ruhlen 2007). They may be marked with an asterisk just like the legal products of the comparative method – a practice bound to confuse a non-specialist by creating the impression that some actual reconstructive work has been done.

In the posts to follow I shall focus on Bengtson & Ruhlen’s Global Etymology #27, ʔAQ’WA ‘water’. I intend to show, first, how Indo-European words meaning ‘water’ are analysed with the help of the standard comparative method; then, how Nostratic linguists handle data extracted from several families (including IE) to reconstruct a putative common proto-word at the macrofamily level; and finally, how mass-comparatists identify a global etymology (and restore the form of the corresponding word).

References
Bengtson, John D. & Merritt Ruhlen. 1998. “Global etymologies”. In Merritt Ruhlen, On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford, CA: Stanford University Press.
Greenberg, Joseph H. & Merritt Ruhlen. 2007. An Amerind Etymological Dictionary. Stanford, CA: Stanford University Press.

20 comments:

  1. Congratulations! You chose a very interesting word.

  2. I was partly inspired by your blog entry about "aqua".

    Replies
    1. The thing is, the meaning 'water' is too loose to be useful in long-range comparisons, having many semantic associations (e.g. 'running water' vs. 'still water'). Add to this chance phonetic resemblance and you'll get a real nightmare.

      Although technically different, it's amazing to see the "classical" comparative method (when applied to long-range data) and Greenberg-Ruhlen's "global etymologies" produce quite similar results: macro-families dating back to roughly 15,000-12,000 BC, a chronology too shallow to be anywhere near a supposed Proto-World.

    2. In the case of 'water', I think "imprecise" would be a better term than "loose" or "broad".

      On the other hand, generic meanings can evolve to more specific ones (narrowing) and vice versa (broadening). Latin aqua 'water' from Paleo-European *ɑkw-ā 'running water' is an example of the latter.

      The reconstructions I alluded to earlier would give us at best a rough idea of what Mesolithic people spoke in different parts of the world, but they by no means constitute evidence of genuine language relationships.

    3. “Although technically different, it's amazing to see the "classical" comparative method (when applied to long-range data) and Greenberg-Ruhlen's "global etymologies" produce quite similar results”

      It would be amazing if the results had been achieved independently, but of course "globalists" often use the forms supplied by "long-rangers" as bona fide data. That's how words like aqua get promoted to IE, Nostratic and eventually World status. But please have a little patience. I'm travelling and working today and have little time left for blog posts. I'll try to continue as soon as possible.

    4. Long before Ruhlen, the Italian linguist Alfredo Trombetti posited a number of "global etymologies" in his book L'unità d'origine del linguaggio (1905).

    5. I should remark that Ruhlen's data includes an Afrasian 'water' word which could be a genuine cognate of IE *h1ēghº- 'to drink' (º denotes labialization), also quoted by Ruhlen (although he doesn't mention the IE protoform).

  3. The Germanic reflexes are, I think, particularly interesting, having been marginalized by *wed-.

    Replies
    1. I meant to say the modern Germanic reflexes.

  4. I find it interesting that French /o/ still functions jolly well as a content word.

    Replies
    1. And /u/ too, at least in Quebec French. Quite a comedown from Augustus.

  5. /t/ is often restored, and /a/ may be retained dialectally, but A(u)gustu(m) > /u/ is an example I always give to my students to show them what a victim of phonetic attriction looks like.

    Replies
    1. There also seems to be a tendency, at least in some linguistic contexts, to "abandon" words that are too phonetically minimal.

      E.g., I've heard that one possible reason why the Old English word æ "law" was replaced with Old Norse lagi is that the former was getting too phonetically small to be easily intelligible as a content word. A similar explanation could account for the loss of OE ea "river" in favor of the Romance-derived word that English now uses.

      Are there any theories on why the Old English words were replaced, while the French terms have remained in use to this day (if I'm not mistaken, it has been several centuries since the French words were contracted into single vowels)? Does it have to do with the elite of Old English society being non-OE-speaking at the time that æ and ea were edged out?

    2. I wonder if the difference between a stress-timed and a syllable-timed language can be blamed.

  6. Is that attriction a slip or a "low philological jest", as Tolkien said of the name of his dragon? If the latter (or even if not), see this list of self-referential linguistic terms.

  7. It was just a typo, but thanks for the list! It looks fairly complete, though one could add compound-formation, lithping, pentasyllabic, ssssound ssssymbolism, vocawization, and maybe a few more.

    My favourite: sibboleth (just visualising the consequences).

  8. There is a "no content word with less than three letters" rule in English, with a few modern exceptions like ax.

    It seems clear that abeille 'bee', which is Occitan, displaced the Francien word because that would have become simply /e/. Similarly, the now unmeaning morph -zi (once 'child, offspring, seed') got attached to a great many nouns in modern Mandarin because the wholesale collapse of phonological distinctions made for too many homonyms.

  9. But there are quite a number of minimal content words of the structure VC or V: (inn, egg, edge, etch, err, eye, awe, owe, ill ...), so it's a purely orthographic dislike.

  10. I wonder why Greenberg's classification of African languages has become more widely accepted than his Amerind or Eurasiatic proposals, despite being possibly just as controversial.
