11 August 2014

De-Extinction: The Mammoth Walks Again

A word has a definable function if speakers regularly select it to convey a certain meaning (or, more generally, to achieve a certain communicative effect). As long as they have a reason to do so, a word remains useful and there is a good chance that it will stay in circulation. A word which is used frequently will be transmitted to new users more reliably, especially if its function is easy to infer from the way it is used. Low-frequency words are prone both to semantic change and to lexical replacement: new speakers may quite accidentally fail to hear them used, or encounter them only occasionally in a context which doesn’t quite clarify their meaning. Word death is mostly due to accidental transmission breaks happening too often.

If historical linguists had any say in the matter, I’m sure that time-honoured words, priceless as evidence of language history, would enjoy special protection, and every care would be taken that they should be saved for posterity (no matter if we still need them for everyday communication). Alas, linguists have no such authority. It’s common usage plus quirks of fate that ultimately decide whether a word will die or survive.

A word already dead in spoken language may occasionally come back to life. Talking of fate and its quirks – here is one well-known case.

The descendant of the Old English noun wyrd ‘fate, destiny, fortune’ was practically extinct by the sixteenth century, ousted by its Latinate synonyms. It lingered on in Scotland long enough to be used by John Bellenden (in the 1530s) in his Scots translation of a Latin version of the story of King Duncan and Macbeth (published by Hector Boece a decade earlier). The three prophesying fairies which appear in the narrative, thought to be the supernatural “Fates” who control human destiny (comparable to the Greek Μοῖραι, the Roman Parcae or the Scandinavian Nornir), are called weird sisteris (literally = ‘the Fate Sisters’) by Bellenden. He didn’t invent the phrase; it can be found in earlier Scots sources referring to the three classical Fates.

The story told by Boece and translated by Bellenden was in turn adapted by the English chronicler Raphael Holinshed and his collaborators, and thus the weird sisters found their way into The Chronicles of England, Scotland, and Ireland. The second edition of that work, published in 1587, was Shakespeare’s source for the plot of Macbeth. Some confusion must have taken place in the process. Shakespeare turned Holinshed’s “goddesses of destinie, or else some nymphs or feiries” into repulsive old hags with “choppy” fingers, skinny lips, and even beards to boot. Shakespeare and the compositors of the First Folio (1623) were apparently puzzled by the unfamiliar word weird. The original phrase underwent deformation into weyward or weyard sisters; the first word was possibly taken for an adjective similar to wayward, and pronounced as two syllables (although exactly how Shakespeare understood it and whether he actually confused it with wayward are moot questions). Later editors “restored” the spelling used by Holinshed and his Scots source (but not by Shakespeare), bringing back the form of weird, but not its original function. Like an Egyptian mummy from old horror films, weird rose from its tomb and strutted about, half-resurrected but not sure what to do in the modern world.

Nineteenth-century readers and playgoers deduced the meaning of weird from what they saw on the stage. They were shown three “Weird Sisters” portrayed as grotesquely hideous witches, bizarre and unearthly. “Ah,” thought the audience, “so that’s what they mean by ‘weird’.” Before long, weird became a popular adjective to describe anything strange or uncanny. Crucially for its further spread, it managed to colonise the colloquial register of English, in which there is a constant demand for new emotionally coloured words to replace those that have become hackneyed. A function was apparently there, waiting for a suitable word to express it. What remains of Old English wyrd is just the form, like an empty shell, co-opted for completely new grammatical and semantic uses. Those who would like to clone the mammoth should draw a lesson from it.

The life restoration of a 17th-c. word.
De-extinction can happen in various ways. The word twat, gone obsolete for about a century, was excavated by Robert Browning and mistaken for something entirely innocent (the context was again not clear enough and could suggest a nun’s headgear; see here and here at Language Log). Browning’s naive mistake was later exposed by the Wise Clerks of Oxenford, much to the delight of those who heard of it, and the seventeenth-century four-letter word came back to life, regaining even its high obscenity index. It’s probably far more frequent now (especially in British English) than it ever was in its former heyday. Please consider this cautionary example before you de-extinct the thylacine.

My personal favourites among the words that should have been saved (but were not) are old kinship terms. Proto-Indo-European had a large and complicated system of names for different kinds of family relations. Many of them were still used in Old English, but only a handful have survived till now – those referring to the closest biological relationships (mother, father, sister, brother, daughter, son, all of them with impeccable PIE pedigrees, even if sister was touched by Old Norse influence). A few have been replaced by terms borrowed from French (aunt, uncle, niece, nephew), also traceable back to PIE, but acquired second-hand. Note, by the way, that while Old English ēam, for example, referred specifically to a maternal uncle in the strictest sense (the brother of one’s mother), an uncle could be maternal or paternal already in Middle English. Furthermore, uncle may refer to the husband of one’s aunt (again maternal or paternal) – not even a blood relation. We are dealing here with a new system replacing an older one, not just a series of lexical replacements.

The boringly transparent “in-law” terms have replaced the Old English words for affinity relationships. Not a single one has survived. All that mattered in the late Middle Ages was the degree of affinity as defined by the Code of Canon Law (which prohibited sex and marriage between some people so related), and the “in-law” terminology made that explicit. Gone are such beautiful Old English relics as snoru ‘daughter-in-law’ (from PIE *snusós) and tācor ‘the brother of one’s husband’ (note that only a woman could have one) – one of the four kinds of brotherhood-in-law possible today. The latter word has relatives in Indo-Iranian, Balto-Slavic, Greek, Latin, and Armenian. The PIE stem is usually reconstructed as *dah₂iwér-, but the details of its development in some branches of the family (including PGmc. *taikuraz and its historical reflexes) are not quite clear, making it especially interesting.

Couldn’t we revive those forgotten kinship terms, just for fun? Well, I don’t think the two just mentioned would have much chance of success. Had snoru developed regularly, it would be *snore today, and I doubt if any woman would find such awkward homonymy acceptable. Tācor, in turn, would have become Modern English *toker. Unfortunately, such a form (orthographic and phonetic) is no longer up for grabs. We find it in the lyrics of “The Joker” (by the Steve Miller Band):
I’m a joker
I’m a smoker
I’m a midnight toker...
and it doesn’t mean an Anglo-Saxon brother-in-law.


  1. Another pleasing case of revival through Scots is raid, which, like road, is < OE rād 'riding', the nominalization of (h)ridan 'ride'. The latter is English, and underwent the (pseudo-) sound change ā > ō. In Scots, this change was inoperative, and the semantic shift from nominalization to true noun didn't happen either, so the Scots version of the Great Vowel Shift gave us raid with the sense 'mounted foray, predatory expedition on horseback' (thus the SND). Sir Walter Scott introduced the word into English, where it has grown to have an extended sense: 'sudden or vigorous attack or descent upon something for the purpose of appropriation, suppression, or destruction' (thus the OED).

    Although niece and nephew are French, the native words they displaced, nift and neve (cf. German Nichte, Neffe) were very similar. I suppose this is the consequence of the sound-changes separating Germanic and Romance not happening to affect these words very much.

    1. Yes, the 'nephew' and 'niece' words in Germanic and Romance are of course related (from PIE *nep(o)t- and its femininised counterpart *nept-ih₂-). Less obviously, uncle and ēam are also partly cognate! But as I said, it's a case of system change. PIE *népōts and the Germanic words derived from it meant not only 'nephew' but also 'grandson'. So did Latin nepōs and, initially, both Middle English neve (inherited, from OE nefa) and neveu (borrowed from French), but the introduction of the "grand-" system caused them to narrow down their meaning. Here, for once, a more general term was replaced by more specialised ones.

    2. A small correction: the OE 'ride' verb was just rīdan (rād, ridon, riden), without an initial h-. But raid is a nice example, thanks for it. Interestingly, OE rād meant all these things ('riding' as an activity, 'raid', and 'road'). Scots and English have developed more specialised senses independently.

    3. The OED3 s.v. ride lists hrīdan as a rare spelling, for what it's worth. I suppose it could be a scribal glitch, though.

  2. Crap, my comment disappeared when I logged in. It used to say "you will be asked to log in after submitting your comment"... no longer.

    Let's see if I can reconstruct it.

    Please consider this cautionary example before you de-extinct the thylacine.

    Frankly, that would totally be worth it.

    All that mattered in the late Middle Ages was the degree of affinity as defined by the Code of Canon Law (which prohibited sex and marriage between some people so related), and the “in-law” terminology made that explicit.

    Interestingly, while German has developed such a system as well, it hasn't made it quite so transparent and uses one of the old terms for it: "brother-in-law" is Schwager, "sister-in-law" is Schwägerin, and all the other terms for in-laws are formed with Schwieger- functioning as "-in-law". Apparently there are dialects where Schnur survives in the meaning "sister-in-law"; it's a homonym of... the category between "string" and "rope".

    The PIE stem is usually reconstructed as *dah₂iwér-, but the details of its development in some branches of the family (including PGmc. *taikuraz and its historical reflexes) are not quite clear, making it especially interesting.

    You're alluding to Cowgill's law in Germanic, right? I can't remember where, but I once read about an alternative hypothesis: sometime after all laryngeals had disappeared in the ancestry of Germanic, an epenthetic /g/ was inserted into /jw/ clusters, and then Grimm's law turned it into /k/ as usual.

    In Scots, this change was inoperative

    Thanks for confirming the suspicion I formed a few days ago in Scotland! :-)

  3. Schnur

    Cord? Twine?

    Shnur or shnir means 'daughter-in-law' in Yiddish.

  4. David: You're alluding to Cowgill's law in Germanic, right? I can't remember where, but I once read about an alternative hypothesis: sometime after all laryngeals had disappeared in the ancestry of Germanic, an epenthetic /g/ was inserted into /jw/ clusters, and then Grimm's law turned it into /k/ as usual.

    That's Seebold 1982 (Indogermanische Forschungen 87). According to Elmar Seebold, *w > *g between any non-syllabic sonorant (glide, liquid or nasal) and *u (by dissimilation). That happened after the loss of laryngeals but before the vocalisation of syllabic liquids and nasals (and of course before Grimm's Law), so that *daiwr̥- became *daiwur- > *daigur- > *taikur(az). It makes more sense to me than Cowgill's Law, and I think is supported by more examples. Besides, variation like [w] ~ [ɣ] ~ [g] is pretty common. I wonder if you have heard about Seebold's rule on this blog (see here).

    1. Oops, self-correction:

      That happened after the loss of laryngeals and after the vocalisation of syllabic liquids and nasals, but before Grimm's Law.
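
As a toy illustration of the relative chronology in the corrected version (vocalisation of syllabic liquids first, then Seebold's *w > *g, then Grimm's Law), the derivation of *taikur- can be replayed as ordered string rewrites. This is only a sketch under loud assumptions: the function names are invented for the example, sonorants are crudely identified by spelling, and only the two Grimm's Law shifts relevant to this one word are included.

```python
import re

SYLLABIC_R = "r\u0325"  # r + combining ring below, i.e. syllabic r


def vocalise_syllabic_r(form):
    # Syllabic r develops an epenthetic u: r̥ > ur
    return form.replace(SYLLABIC_R, "ur")


def seebold(form):
    # Seebold's rule as summarised above: w > g between a
    # non-syllabic sonorant and u. Sonorants are approximated
    # here by a small set of letters (an assumption of this toy).
    return re.sub(r"(?<=[ijwlrmn])w(?=u)", "g", form)


def grimm(form):
    # Only the two shifts needed for this word: d > t, g > k
    return form.translate(str.maketrans({"d": "t", "g": "k"}))


def derive(form):
    # Apply the rules in the order given in the self-correction
    stages = [form]
    for rule in (vocalise_syllabic_r, seebold, grimm):
        form = rule(form)
        stages.append(form)
    return stages


print(" > ".join(derive("daiw" + SYLLABIC_R)))
# daiwr̥ > daiwur > daigur > taikur
```

The ordering matters in this toy just as in the argument: if `seebold` runs before `vocalise_syllabic_r`, there is no *u after the *w yet, so the rule has nothing to apply to.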

  5. Interestingly, while German has developed such a system as well, it hasn't made it quite so transparent...

    Ditto for Polish. We have lost Proto-Slavic *snъxa 'daughter-in-law' and *děverь 'husband's brother' together with most other inherited affinity words (the full set was still used in Old Polish but only a couple have survived and another couple may count as obsolescent though not completely lost). They were replaced either by new, transparently derived terms (like synowa 'daughter-in-law', a possessive derivative of syn 'son') or by loans (that's the case of szwagier 'brother-in-law [generalised]', borrowed from German).

    1. Bulgarian still has снаха (snaha) and девер (dever) for these kinship terms. I know that Russian has сноха (snoha) as well.

    2. Yes, these and more, e.g. Bulg. šurej, Russ. šurin for another kind of brother-in-law (wife's brother). Old Polish had the same words: snecha, dziewierz, szurzy etc. The first two had cognates in Old English: snoru, tācor.

  6. That's interesting about Yiddish!

    Cord? Twine?

    Maybe, but "cord" makes me think of something bigger and twine of something smaller... I need to do some research.

    I wonder if you have heard about Seebold's rule on this blog (see here).

    That's possible!

    It makes more sense to me than Cowgill's Law, and I think is supported by more examples.

    It's more parsimonious in not requiring the very late survival of laryngeals in the position where they'd have been the most difficult to pronounce, and in the thread you link to you said "there are maybe half a dozen examples" – apparently Cowgill's "law" only has three, and all of them can be explained by Seebold's.

    the full set was still used in Old Polish but only a couple have survived and another couple may count as obsolescent though not completely lost

    Until rather recently, literary German used Vetter (m.) and Base (f.) for "cousin". Their original meanings, however, are "father's brother" and "father's sister"! Oheim, poetic for "uncle", once meant "mother's brother", and Muhme, originally "mother's sister", occurs in this book from 1957 for an indeterminable female relative.

    Vettern still means "real or metaphorical relatives", but isn't used much anymore.

  7. It's more parsimonious in not requiring the very late survival of laryngeals in the position where they'd have been the most difficult to pronounce,

    Note that *dah₂iwér- is usually reconstructed with the laryngeal before *i. It's needed to colour the adjacent vowel. I'm not sure if it can explain the Balto-Slavic root accent without stretching the applicability of Hirt's Law. If it can't, I'd prefer to reconstruct the stem as *daiwér-, with a fundamental a-vocalism and no laryngeal at all. The alternative reconstruction *daiHwér- makes no sense at all. It doesn't allow Hirt's Law to operate in Balto-Slavic, and although the laryngeal doesn't colour anything, we still have to reconstruct *a in the first syllable. The only potential advantage of this reconstruction is that it might explain Gmc. *k via Cowgill's Law. But if there is a better alternative to Cowgill's Law, the gain is illusory, and Ockham's Razor applies.

    and in the thread you link to you said "there are maybe half a dozen examples" – apparently Cowgill's "law" only has three, and all of them can be explained by Seebold's.

    I don't think it explains *kʷikʷa- 'living, alive' (which doesn't worry me; I have a different idea about it). It does explain a few problematic words, such as *aikur-na- 'squirrel' < *(w)aiwr̥- (vel sim., cf. Balto-Slavic *waiwer-).

    1. Oops, sorry; it's with Seebold's hypothesis plus your paper on "the meaning of life" that the three examples are all explicable. I shouldn't comment quite so late at night. :-)

      So Eichhörnchen is cognate with veverica! That's fascinating. :-)

      Incidentally, what is the evidence that Proto-Balto-Slavic hadn't yet turned *w into [v~ʋ], which is what all attested Baltic and Slavic languages have? Is it just the fact that medved isn't **medojed?

    2. Proto-Slavic *v is reflected as a bilabial approximant [w] (merging with vocalised "dark" *l) in Lower and Upper Sorbian, and /v/ has a bilabial allophone in syllable codas in Slovene, Slovak, Ukrainian, and Belarusian. Even where it's a fricative, /v/ still tends to pattern phonologically with sonorants rather than fricatives. For example, it doesn't cause regressive voice assimilation in /kv/-type clusters, and if assimilation does take place, its direction is untypical (/kf/, not /gv/). All this suggests that even Proto-Slavic *v was still an approximant (despite the traditional spelling). The "hardening" of [w] to [v] is commonplace and has taken place convergently in lots of languages. After all, if it weren't for the testimony of English (OK, with some support from Danish and Dutch), one might suspect that *w > *v took place in common Northwest Germanic, judging from its modern distribution.

    3. All this suggests that even Proto-Slavic *v was still an approximant

      Yes, but which one – [w] or [ʋ]?

      In Germanic, [ʋ] is at least as widespread as [v] today: Dutch (except West Flemish, which retains [w]), broadly northern German (for instance here in Berlin), at least West Frisian, Danish, Norwegian, more or less Swedish... there are also Austrians who at least have it as an allophone, as I found out to my surprise during the BAWAG scandal (/ˈbaːvag/, Bank für Arbeit und Wirtschaft), which seemingly all TV and radio reporters pronounced with some kind of diphthong (not the same as au, but similar).

      Myself, I seem to articulate it as a strongly nasalized [ṽ]: there's a wide contact between lip and teeth, and no effect on surrounding vowels, but I can't feel any friction. Maybe I'm actually avoiding the tickling feeling of voiced friction by letting the air out through the nose.

      /v/ behaves like an approximant all over the place: like /j/ it can't occur at the end of a syllable, and therefore 1) lacks a long counterpart even in Upper German, where consonant length is generally retained, and 2) doesn't occur behind short vowels in Standard German.

      Other than in English and West Flemish, [w] is reportedly retained in the "Swedish" dialect of Älvdalen, in at least one Walser dialect in (or rather high above) the Aosta valley, and behind orthographic u in Sater Frisian.

    4. J. Knobloch (Die Funktion des Schwagers usw., Arch. Glott. Ital. 77:86-8, 1992) argued that since the groom's brother led the bride away from her family, his designation was 'Trenner', and the noun was based on *dah2(i)- '(zer)teilen'. Since this root, for whatever reason, appears in various languages with and without an /i/-extension (much like *dHeh1(i)- 'to suck(le)'), it seems possible that the noun exhibited similar variation, with *dah2i-wer-, *dah2-wer- becoming *daiwer-, *dagwer- after Cowgill's (Gmc.) Law, and *taiwer-, *takwer- after Grimm's Law. Perhaps a compromise-form *taikwer- is reflected in the Germanic languages.

      W.P. Lehmann (PIE Phon. 50-1) similarly proposed a compromise, but his mechanism is unsuitable for "modern" PIE, since it depends on reduced grade and unsupported loss of the final element of long diphthongs in syllable-final position.

    5. David: Yes, but which one – [w] or [ʋ]?

      Possibly both, for all we know. Such phonetic details are rarely reconstructible. I use *w for Proto-Balto-Slavic also with the same caveat.

      Douglas: Knobloch's etymology is of course reasonable (in several Slavic languages reflexes of *děverь mean not only 'husband's brother' but also 'best man'). I still think Seebold's phonological explanation is more promising than any alternative I know of, and it works even if one remains agnostic about the presence (and the exact position) of a laryngeal. It certainly works for *dah₂i-wer- (the details of its morphological derivation are still not entirely clear, since *-wer- is hardly a productive agent suffix).

  8. The Mycenaean forms _o-to-wo_, _o-tu-wo_, _o-two-wo_ indicate that Attic-Ionic _orthós_ 'upright' continues earlier *orthwós. Thus _pentherós_ 'father-in-law' (originally only 'wife's father' according to Aristophanes of Byzantium) can continue *penthwerós, a thematization of *penthwer- with accent after _hekurós_ 'father-in-law, step-father'. This can reflect PIE *bHendH-wer- 'one who binds (his daughter formally to her husband)'. Like *dah2i-wer- this appears to contain two full grades and is reminiscent of 'four'. If Hitt. _kutruwenes_ 'witnesses' is related to 'four', the root is likely *kWeth1- 'to point (out)' and the nt. coll. *kWet(h1)wó:r (with laryngeal deletion by de Saussure's Effect) 'set of pointers, fingers excluding the thumb' became 'set of four' underlying Go. _fidwor_, Lat. _quattuor_. In most IE lgs. this was substituted by the pl., nom. *kWét(h1)-wor-es, acc. *kWth1-wér-n.s, obl. *kWth1-ur- (ord. *kWth1ur-th1ó-) with various remodellings, usually including replacement of zero root-grade. The original force of *X-wor-es would have been 'members of a group which does X'. The attested singulars _da:é:r_, _pentherós_ in this view represent remodellings of the acc. pl. stem *X-wer-.

    Knobloch (op. cit.) agreed with Fraenkel (LitEW 331) that Lith. _láig(u)onas_, _laigônas_ 'wife's brother' belongs with Lat. _liga:re_ 'to tie' and Grk. _loigo:ntían_ = _phratrían_ acc. sg. 'body of kinsmen, clan' (Hsch.). He argued that the ritual in which the groom's brother separated the bride from her old family was followed by another in which the bride's brother formally tied her to her new family. In these relationships *bHendH- apparently referred to the union of bride and groom sanctioned by the bride's father, and *leig- to the bride's union with her new family, ritually consummated by her brother. "Blessed be the tie that binds."

    If the original term for 'wife's brother' was not an /o/-grade derivative but the /e/-grade *leig-wer-, it would help explain the PGmc compromise-form *taikwer- which I proposed in my earlier comment. Prosodic symmetry with *leikwer- would have favored *taikwer- over *taiwer-, *takwer-, and the rival compromise-form *tawer-. Moreover, *leig-wer- would have become Old Latin *leiver-, explaining _laevir_, _le:vir_ as a contamination of inherited *daiver- with *leiver- (and later _vir_ of course). I should mention that Pokorny (IEW 179), without benefit of Cowgill's Law or Seebold's alternative, proposed that the Gmc. *k in OE _ta:cor_, OHG _zeihhur_ arose by crossing with a cognate of Lith. _láigonas_.

    1. If Cowgill's Law operated without restriction, we would expect prevocalic reflexes of PIE *nah2w- 'ship' to show *nakw- in Gmc., even if the declension were shifted like Lat. _na:vis_ (/i/-stem, not C-stem). In fact ON _nór_ 'ship' and _Nóa-tún_ 'Njord's dwelling, Shiptown' point to a thematized Gmc. *no:wa-, earlier *náh2wo-, on which CL did not operate. The old zero grade of the C-stem appears in ON _naust_ 'shiphouse' from PIE *nh2u-sth4óm.

      OE _naca_, OS _naco_, OHG _nahho_, and ON _no,kkvi_ 'boat' reflect Gmc. *nakwan-. Pokorny (IEW 770) cited Skt. _nágah._ 'mountain; tree' as cognate. Mayrhofer (KEWA 2:125-6) considered this "schwerlich ... richtig" because _nágah._ as 'mountain' occurs earlier than 'tree', whereas Gmc. *nakwan- must originally have meant 'log canoe'. He preferred to connect *nakwan- with the PIE for 'ship', referring to Lehmann (PIE Phon. 49). However, Lehmann did not explain why the laryngeal lengthened the foregoing vowel in _Nóa-tún_ but merged with the following */w/ in _naca_, and he did not account for the -v- remaining in _no,kkvi_.

      Evidently if Cowgill's Law is to work, the cluster *h2w or *h3w cannot have thematic vowels on both sides. 'Quick' has */i/ on one side. For the 'boat' word I presume amphikinetic inflection in PIE:

      NSg. *náh2wo:n
      ASg. *náh2wonm.
      GSg. *nh2unós
      LSg. *nh2wéni

      Gmc. vocalized laryngeals generally became */a/ in the (first) root-syllable, otherwise */u/. I assume that in the earliest PGmc, when syllabic resonants acquired epenthetic */u/, interconsonantal laryngeals acquired a (high) schwa */@/. 'In the boat' thus became *n@h2wéni, and I assume CL could operate, yielding *n@gwéni, so this was the pre-Grimm's Law paradigm:

      NSg. *ná:wo:n
      ASg. *ná:wonum
      GSg. *nunós
      LSg. *n@gwéni

      I now assume that this unusual paradigm was split. With boats used for carrying people, the loc. sg. 'in the boat' had high frequency in speech, and a new paradigm *n@gwon- was modelled on it. This became after Grimm's Law *n@kwon-, after Verner's Law and root-accent generalization *nakwan-. With other boats, the loc. sg. had much lower frequency than the acc. sg., and the paradigm *na:won- was normalized. This is the source of OHG _ver-nawun_ 'boats that carry wood'.

    2. PIE *nh2u-sth4óm

      ...Also intriguing: *h4. Which version of it do you mean? There seem to be several different hypotheses that use this symbol. :-)

    3. What I mean by *h4 is the /a/-colored laryngeal which produces aspiration of tenues in Sanskrit. The other one, *h2, shows up explicitly in Hittite, but *h4 only leaves coloration there.

      The derivation of Gmc. *nakwan- suggests extending Cowgill's Law to the "weak" or "breathy" laryngeals in anamphithematic intervocalic position, viz. *-h1w- and *-h4w- become pre-Grimm's Law *-gHw-. One illustration is 'bridge'. Gaulish _bri:va_ presupposes PIE *bHréh1wah2, while ON _brú_ (earlier *bró < Gmc. *bro:wo:) points to *bHroh1wáh2 with the cluster in amphithematic position.

      I take Swiss German _Brügi_ 'wooden framework, stick-built structure' as peripherally continuing OHG *brugi, *prugi < Gmc. *brugi:, a zero-grade fem. like PIE *wl.kWíh2 'she-wolf' (> Skt. _vr.kí:_, ON _ylgr_). The PIE nom. sg. *bHrh1wíh2 would become in the earliest PGmc *bHr@h1wíh2, then by extended CL *bHr@gHwíh2. I must now assume that the epenthetic high schwa */@/ acquired its historical coloring sometime before the accent became fixed on the root-syllable: */a/ under the accent, */u/ otherwise. This plus Grimm's Law and postvocalic laryngeal absorption would yield *brugwí:, unchanged by Verner's Law.

      'She-wolf' was *wulgWí: in North PGmc immediately after Verner's Law. Oblique cases had *wulg(W)j- with regular loss of the labial component, leading to a new nom. sg. *wulgí:, becoming Proto-Norse *wulgi, which acquired *-R in Runic Norse after animate /i/-stems; RN *wulgiR > ON _ylgr_ is regular. I assume oblique *brug(w)j- likewise led to nom. sg. *brugí:, (Upper) OHG *brugi, *prugi.

      'She-wolf' in WGmc dialects comes from *wulbí: whose pre-Verner's Law form had *f for *xW after 'he-wolf'. (I take this *f as arising regularly in the nom. sg. by contact assimilation after stem-vowel syncope, *wúlxWos > *wúlxWs > *wúlfs, and spreading to the other cases due to the high usage frequency of the nom. sg.) The oblique cases underwent WGmc /j/-gemination to *wulbbj-, leading to a new nom. sg. *wulbbi:, Proto-HG *wulppi, transferred to the them. fem. decl. as OHG _wulpa_ (but MHG _wülpe_ < OHG dial. *wulpi); similarly Early OE *wylb(e), OE _wylf_.

      Likewise we have (after obl. cases) Proto-HG *brukki: 'bridge', thematized as OHG _brucka_ (but MHG _brücke_ < OHG dial. *brucki); similarly OE _brycg_, OS _-bruggia_ 'bridge'; ON _bryggja_, LG _brügge_ 'pier'.

    4. Interesting. I knew about the idea of postulating *h4 to account for Hittite words with a which can't be derived from *e or *o, but I didn't know that any of those coincide with a Sanskrit voiceless aspirate!

      The PIE nom. sg. *bHrh1wíh2 would become in the earliest PGmc *bHr@h1wíh2, then by extended CL *bHr@gHwíh2.

      ...Do you think *h1 was [ɦ]?

      I take this *f as arising regularly in the nom. sg. by contact assimilation after stem-vowel syncope, *wúlxWos > *wúlxWs > *wúlfs

      Huh. The textbook/Ringe reconstruction is *wulfaz without syncope. Is *xʷs > *fs regular?

      Given Latin lupus, I had wondered about taboo deformation turning *kʷ into *p in Proto-WIE; I can't find a Celtic cognate...?

      MHG _wülpe_

      Also wülbe, according to something I once read somewhere.

      thematized as OHG _brucka_

      That might explain the lack of umlaut in this word in... something like Upper German dialects today.

    5. Interesting. I knew about the idea of postulating *h4 to account for Hittite words with a which can't be derived from *e or *o, but I didn't know that any of those coincide with a Sanskrit voiceless aspirate!

      I don't think this idea is defensible. *h₂ is not reflected as h in Hittite after stops, but it does leave traces there, also where Sanskrit shows "laryngeal" aspiration. A good example is the 2sg. ending of the hi-declension, Hittite -tti in vocalic stems < *-th₂a(i) (Skt. -tha), where both the gemination and the absence of palatalisation are the expected effects of "plain" *h₂.

    6. Oops, I forgot to mark the first paragraph above as a quotation.

    7. And of course "conjugation", not "declension". I need another cup of coffee.

    8. Piotr: I find that trilaryngeal theory fails to account adequately for the discrepancy between Skt. _pitár-_ and _sthitá-_. Fathers feed and protect their families, and Hittite securely reflects *pah2-, so I have no problem with *ph2tér-. But 'stand' must be *stah4-, which is why I reconstruct ON _naust_ 'shiphouse' as *nh2u-sth4óm. I must conclude that Hitt. -tti beside Skt. 2sg. pf. act. -tha continues *-th4a(i). Since the 3sg. /hi/-conj. suffix does not even contain a laryngeal, there is no reason to insist that the 2sg. must have the same laryngeal as the 1sg. *-h2a(i). Moreover, if *h2 did produce Indic aspiration, the Skt. 1sg. pf. act. would regularly be aspirated with tenuis-final roots, e.g. **sus.vapha 'I slept'. Analogy would be unlikely to efface this because it would involve a simple surface-rule, unlike Brugmann's lengthening in the 3sg. against 1sg., which did not occur with all roots and cannot be described by a surface-rule, yet it persisted into historical Sanskrit.

      Mayrhofer's assignment of Avestan f- in some 'father' forms to *pH- resulting from *ph2-, as opposed to *p(i)- from *p&2- < *p@h2-, explains nothing (Fortsetzung der idg. Laryngale im Indo-Iran. 118). After quoting Benveniste to the effect that Avestan is more faithful than Vedic to archaic forms, Mayrhofer lists the sg. 'father' forms from the RV, all of which have -i- < *h2. He then lists the Avestan forms, which according to him show an archaic alternation between *pitar- and *p(H)tar- (reflecting *p&2ter- and *ph2tr-, EWA 2:129). The /f/-forms are Old Av. dat. sg. _f@ðro:i_ /fþrai/ beside _piþre:_ (= Skt. _pitré_) and Young Av. acc. pl. _f@ðro:_ (= Skt. _pitr´:n_). Loss of */i/ from an unstressed laryngeal between stops is regular here, so the Av. forms with pi- must have been contaminated by the vocative. Since Av. þr- from *tr- is regular, it stands to reason that fþr- (written _f@ðr-_) from *ptr-, earlier unstressed *pitr- < *ph2tr- as in Skt., is also regular. The Av. /f/-forms thus imply nothing about an archaic alternation lost by Vedic, and they do not lead to an explanation for the absence of aspiration in _pitár-_. The simplest explanation is that *h2 did not produce aspiration in the first place.

      Mayrhofer reflexively ascribes all such Indic aspiration to *h2, even when cognates point to *h1. Thus he reconstructs Skt. *ásthi/n- 'bone' as *h2ost-h2-n- (Fszg. 112), while Beekes recognizes that Grk. _ostéon_ requires root-final *h1, and reconstructs *h3esth1-i- (EDG 1119). Likewise, Beekes recognizes that Skt. 2pl. pres. act. -tha must represent *-th1e (Comp. IE Ling. 232-3). Unfortunately, like other Leiden scholars, he is still trapped in the trilaryngeal straitjacket. I group *h1 and *h4 as "soft" or "breathy" laryngeals, aspirating PInIr tenues but not showing up as consonants in Hitt. intervocalic position. By contrast *h2 and *h3 were "strong" laryngeals.

    9. David: I think *h1 was most likely the glottal fricative [h]. It did not aspirate plain mediae in PIE or PInIr. In my opinion *h4 did acquire the allophone [h^] in PInIr immediately after a plain media and thus aspirated it, e.g. *meg^h4- 'great' (Hitt. _mekk-_) became PInIr *mejH- > Skt. _mah-_ with nt. sg. *még^h4 > Hitt. _mek_, Grk. _méga_, Skt. _máhi_.

      Textbooks ascribe nom. sg. *wulfaz wrongly to PGmc in my view; it belongs only to Proto-NORTH Gmc., where the stem-vowel /a/ was restored to the masc. nom. sg. by analogy with /i/-stems and /u/-stems (thus Runic Norse nom. sg. -aR, -iR, -uR in these stem-classes). East Gmc. (Gothic _wulfs_) retained the syncopated nom. sg., as did West Gmc., where direct evidence is lacking. However, on the basis of doublets like NHG _Affel_ against _Apfel_, we may infer that WGmc likewise retained the sync. nom. sg., PGmc *apl.s > PWGmc *apuls > Proto-HG *afful against the acc. sg., PGmc *aplaN > PWGmc *appla~ > Proto-HG *apful. Some dialects generalized the old nom. sg., others the acc. sg.

      In my opinion *xWs > *fs was regular, likewise *sxW > *sf, at a particular stage of PGmc when the phones in question were brought together by syncope or external sandhi. PGmc did not inherit intramorphemic sequences of this type. The /f/ of 'wolf' evidently arose in the nom. sg. due to syncope bringing *xW and *s together, when the inherited PGmc accent was replacing pitch with stress as its principal feature, sometime after Grimm's Law but before Verner's. Thus *wúlxWos > *wúlxWs > *wúlfs, with the /f/ eventually ousting /xW/ from the other cases of the masc. paradigm, due to the high usage frequency of the nom. sg. Final *-e was likely apocopated at the same time *-os was syncopated, reducing *fénxWe 'five' to *fénxW. In counting, final /xW/ was then brought into contact with the initial /s/ of 'six', changing the former to /f/. Likewise, initial /xW/ of 'four' was in contact with final /s/ of 'three', provided the entities being counted (in the nom. or acc.) had animate grammatical gender, which was true in most counting situations, and this /xW/ also became /f/. Eventually these changes in 'four' and 'five' due to counting-sandhi were generalized to the cardinal numerals used in isolation (since numerals are usually learned by memorizing a counting sequence) and most of their derivatives.

      For 'wolf', Celtic may have operated with the double zero-grade *ulkWo- seen in Alb. _ulk_. Isolated names like Old Ir. _Olcán_ are not compelling, but the tautology of Ulpi(us) Lupio on a Rhenish inscription (CIR 130) is. The gentilicium appears to be based on the Gaulish name *Ulpos 'Wolf', with the Latin cognomen 'Wolfie'.

      I prefer tabuistic substitution to deformation, which is too powerful. Beekes (EDG 875, 877) has pointed out the formal identity of Grk. _lúkos_ 'wolf' with Swed. _lo_ 'lynx' (Gmc. *luxa- < PIE *lúk^o-). WGmc had a variant *luxsa- > OHG _luhs_, OE _lox_ 'lynx'. Italic might have inherited another variant *lúk^wo-, usable as a tabu-substitute for 'wolf' due to the scarcity of lynxes in Italy (hence Latin borrowed the Greek word). This would regularly become P-Italic *lupo-, whereas _bo:s_ and _scro:fa_ show that Latin did borrow animal-names from P-Italic.

      I could find nothing on _wülbe_ 'she-wolf'. If it exists, it is probably a Low Germanism.

    10. I finally found my source for wülbe: it's this passage citing Kuiper (1995), which isn't online.

      F. B. J. Kuiper (1995): Gothic bagms and Old Icelandic ylgr. North-Western European Language Evolution (NOWELE) (25): 63–88.

    11. I have a copy of Kuiper's paper, heavily pencilled with my criticisms. (I scarcely agree with any of the conclusions, but some useful examples of Gmc. geminated mediae are given, so the paper is worth having.) Kuiper does not cite any MHG _wülbe_, only the usual _wülpe_ from a protoform *wulBjo:-. The Wikipedia author apparently conflated _wülpe_ with the printed protoform, in which crossed b represents B. (I usually avoid the phonetic distinction between [b] and [B] anyway, since we know too little about its chronology, and simply use phonemic /b/ in PGmc reconstructions.) So our _wülbe_ is a vox nihili.

      The Wikipedia author also seems to be unaware of Seebold's comprehensive treatment, "Die Vertretung von idg. gWh im Germanischen", KZ 81:104-33, 1967. I personally have no desire to start editing Wikipedia, since it would probably end up consuming all my spare time.

    12. I'll try to do that tomorrow, then. :-) We can't let voces nihili stand on Wikipedia!

    13. For some value of "tomorrow", heh. I just did it (mostly on the talk page) – and I found Seebold '67 on JSTOR, meaning I can read it as long as I put up with several annoying quirks.

    14. Problem:

      The oblique cases underwent WGmc /j/-gemination to *wulbbj-

      I can't believe I overlooked this for so long: Sievers' law makes that impossible, because **wulbj- would immediately become *wulbij-, making WGmc. gemination impossible; WGmc. gemination only operated on short vowel + consonant + /j/.
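
      A minimal sketch of the ordering argued for in this comment, assuming Sievers' law applies first and WGmc gemination sees only its output. The function names, the simplified vowel inventory and the weight test are illustrative assumptions, not an established formalization:

```python
# Toy model: Sievers' law feeds (and here bleeds) WGmc /j/-gemination.
# Assumption: a stem is "light" only if it ends in one short vowel
# followed by exactly one consonant; everything else counts as heavy.
VOWELS = set("aeiouāēīōū")
LONG_VOWELS = set("āēīōū")

def is_light(stem: str) -> bool:
    # Strip the final consonant run to find the coda.
    i = len(stem)
    while i > 0 and stem[i - 1] not in VOWELS:
        i -= 1
    coda = stem[i:]
    # Collect the vowel run directly before the coda (the nucleus).
    j = i
    while j > 0 and stem[j - 1] in VOWELS:
        j -= 1
    nucleus = stem[j:i]
    return len(coda) == 1 and len(nucleus) == 1 and nucleus not in LONG_VOWELS

def j_suffix_outcome(stem: str) -> str:
    """Return stem plus the *j-suffix after Sievers' law and gemination."""
    if is_light(stem):
        # Light stem: *-j- stays nonsyllabic, so the consonant geminates.
        return stem + stem[-1] + "j"
    # Heavy stem: Sievers' law gives *-ij-, so gemination cannot apply.
    return stem + "ij"
```

      On these assumptions `j_suffix_outcome("wulb")` gives `wulbij`, with no geminate, while a light stem such as `kun` gives `kunnj`.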

      That means we're left without an explanation for the p in wülpe. Could it just be an Old-Bavarian-like spelling with p for /b/ (straight from PGmc. *b, but a voiceless plosive as opposed to a voiced fricative)?

    15. A good point. Sievers' Law and its Germanic converse continued to operate as surface filters after the u-vocalisation of syllabic resonants.

    16. Finally:

      Textbooks ascribe nom. sg. *wulfaz wrongly to PGmc in my view; it belongs only to Proto-NORTH Gmc., where the stem-vowel /a/ was restored to the masc. nom. sg. by analogy with /i/-stems and /u/-stems (thus Runic Norse nom. sg. -aR, -iR, -uR in these stem-classes). East Gmc. (Gothic _wulfs_) retained the syncopated nom. sg., as did West Gmc., where direct evidence is lacking.

      Quite the contrary. There are several West Gmc. rune inscriptions where /a/-stem nominatives end in -a: the -z had been lost, the -a was still there.

      (And then there are those "Frisian" inscriptions where they end in -u instead...)


    17. Sorry, I missed seeing the Sievers' Law comments in October. The widget must have been out of whack. I normally check this blog twice a week for new comments.

      This is my first attempt to use Unicode, so there may be some trouble. (It looks acceptable in preview.)

      Ringe's claim (PIE to PGmc 118) that Sievers' Law alternants "were of course inherited from PIE" is absolute balderdash (cf. Sihler, NCG 175-8). Sievers' Law should be viewed like Osthoff's, as an episodic phenomenon which can strike independently, and cannot rightly be generalized and projected back to PIE. Hopelessly infected by the generative virus, Ringe (120-1) is compelled to obscure the reality of the PGmc Sievers' episode involving *j behind such faddish terms as "surface filter". Thank goodness some of us are immune.

      The question before us is whether Sievers' Law could have been operating when WGmc /j/-gemination occurred, and the answer is "no". Beside MHG wülpe we have diupe 'female thief' showing PGmc *þeubj- > PWGmc *þiubbj- > PHG *þiuppj- > *diup(p)-. Furthermore OHG rinka 'fibula' beside (h)ring 'ring, circle' illustrates PGmc *xreŋgj- > PWGmc *xriŋggj- > PHG *xriŋkkj- > *hriŋk(k)- against *xreŋga- > *xriŋga- > *hriŋg-. Old Upper German leittan 'to lead' explicitly maintains the postheavy geminate of PWGmc *leiddjana- (PGmc *laidjanaN) otherwise reduced in OHG leiten, OS lēdian, OE lǽdan, etc.

      Runic WGmc texts with nom. sg. masc. -a in thematic nouns are not fatal to my theory of PGmc syncope in the corresponding forms. Doublets of 'apple' and 'acre' in OHG already illustrate how some dialects generalized the old nom. sg., others the unsyncopated acc. sg., to the new invariant nom./acc. sg. The PGmc forms were nom. sg. *apļz, *akŗz and acc. sg. *aplaN, *akraN. In PWGmc the syllabic liquids were vocalized and the tenues were geminated by the following nonsyllabic liquids, yielding nom. sg. *apulz, *akarz and acc. sg. *applaN, *akkraN, later *apul, *akar and *appla, *akkra. OHG afful and ahhar (with OE æcer) continue the nom. sg. forms, OHG apful (with OE æppel) and acchar the acc. sg. The Runic WGmc -a in question can be equated with the *appla-*akkra stage, with generalized acc. sg. The (Paleo-)Frisian -u likely shows the same Anglo-Frisian nasal-induced rounding of *-aN which we find in OFris nōmen, OE nómon 'they took' against OHG nâmun; OFris mon, OE monn 'man' against OHG man.

    18. Several good ideas. I'm now rereading the Google Books preview of Ringe & Taylor (2014) on the development of Old English from Proto-Germanic, and on p. 31 the following is given as an uncommented example:

      "PGmc *stubjuz 'dust' (Goth. stubjus) > OHG stuppi"

      This presupposes that Sievers' law did not operate, doesn't it?

      Interestingly, BTW, the modern German word for dust is *stūb- alone, without *-j-: Staub. I suppose the required *ū was shortened before the consonant cluster *bj.

      What do you have against surface filters, though?

      The Runic WGmc -a in question can be equated with the *appla-*akkra stage

      Or even the preceding one; nasality was never written, even *kamba shows up as kaba (assuming somebody did in fact write "comb" on a comb).

      The (Paleo-)Frisian -u likely shows the same Anglo-Frisian nasal-induced rounding of *-aN

      Why u, though, and not o? Other cases of short o were written with the o rune, too.


    19. Edgerton's Converse of Sievers' Law in Germanic is refuted by OE Deni(g)a 'of Danes' and wini(g)(e)a 'of friends' (cf. Erdmann, Suffixal j in Germanic, Lg. 48:407-15, 1972). The protoforms must have been *Dan-ij-ōⁿ, *wen-ij-ōⁿ, and if the Converse had operated, post-light suffixal *-ij- would have been reduced to *-j-, leading to WGmc gemination and OE ˣDenna, ˣwinna instead (beside analogical Dena, wina).

      Onomastic evidence does suggest that MHG wülpe is a Bavarianism. Old feminine personal names with a lupine second element were collected by Müllenhoff (Wolf und Wölfin, ZfdA 12:252, 1865). These include Odulba 774, Rihhulba 765-92, Heriulb (undated) against Waldulpia 719, Hruadulp 788, Perahttulpa 842, etc. The alternation points to underlying Gmc. *b. I must therefore retract my proposal involving WGmc gemination in 'she-wolf', as well as the oblique stem *wulbja-. The WGmc oblique stem must have been *wulbijV-, NGmc *wulgʷijV-, Pre-GL *wulxʷíjV- < *wḷkʷíH-(x) in accordance with the Vedic inflection of rathī́ḥ 'charioteer' (which in my view, argued in my recent academia.edu draft paper, has the same suffix as vṛkī́ḥ 'she-wolf'). The *-ij- here has nothing to do with Sievers' Law; its *j is a glide replacing a laryngeal.

      It sounds as though Ringe's conception of Sievers' Law has evolved. If so, good for Ringe.

      When writing is a novelty, yes, 'comb' may be written on a comb.

      What do I have against surface filters? I think the whole Chomskyan enterprise is a mirage. To explain (for example) English 'lightning' against 'lightening', we must resort to good old soundlaws and analogical processes, the REALITY of language as opposed to the FANTASY of deep structure. Surface filtration gets us nowhere.

      Runic Frisian -u for expected *-o (< *-aⁿ) in word-final position might be comparable to OE -u from PGmc *-ō in the strong nom. fem. sg. and nom./acc. nt. pl. of short-stemmed nouns (e.g. giefu 'gift', limu 'limbs'). I am not suggesting that the SAME EVENT was involved.

    20. The protoforms must have been *Dan-ij-ōⁿ, *wen-ij-ōⁿ...

      Not if the "collective" suffix was originally *-ejo- and the raising of suffixal *e postdates the productivity of the Sieversian alternations in Proto-Germanic. Slavic *-ьje as in *ljudьje is compatible with such an analysis.


    21. You've got me there. I can't prove the Converse didn't operate before the NWGmc j-umlaut of *e to *i (which postdated the borrowing of Fi. telja 'Ruderbank' = OHG dilla 'transtrum', ME thille 'thill'). Since the Danes constituted a people, it's conceivable that *-ej- was generalized throughout the paradigm, *Dan-ej- following *lewd-ej- < PIE *h₁lewdʱ-ej-. Friends don't technically form a people, but the fact that the OE nom./acc. pl. wine is more common than winas (as opposed to e.g. giestas), retaining i-stem inflection like líode, suggests that 'friend' may have traditionally patterned after 'people' (i.e. PGmc *wen-ej- like *lewd-ej-).

      Another questionable passage of Erdmann's involves the Gothic suffix -areis (e.g. bokareis 'scribe') borrowed from Latin -ārius (p. 412). Erdmann applies Sievers' Law to underlying *-ār-j-az, yielding *-ār-ij-az and Go. -areis (i.e. -ārīs) by syncope and final devoicing. This is probably incorrect in that the unwritten glide in the Latin suffix is ignored. What was borrowed was more likely -ārijos > *-ārijaz > *-ārijz by syncope, *-ārijs by devoicing, -ārīs (written -areis) by the regular development *ijC > īC in Gothic. This suffix shouldn't be used as corroboration that SL still operated in Gothic in the Common Era.

  9. Glad to see that you are back with a really interesting article! I've been considering a hammer for so long that I was approaching zen.


  10. more or less Swedish

    Also Icelandic, which I had shamefully forgotten to look up. Can't find detailed information on Faroese.

    If Hitt. _kutruwenes_ 'witnesses' is related to 'four'

    Interesting idea. What about the idea I read in Bomhard's book (it's probably not original to him, but I can't remember) that, in the light of Latin triquetrum "triangular", it may first have referred to corners, then to a square or rectangle, and then to 4?

  11. Corners are pointy, so Latin _triquetrus_ 'three-cornered' probably does involve the same root as 'four'. Gaulish *petros 'corner' (= Lat. *quetrus) was likely borrowed into PGmc before Grimm's Law, whereupon it became *feþro- and was used after Verner's Law to form *þrifeþra- 'three-cornered', surviving in OE _þrifeoþor_. To my knowledge, PIE-speakers did not make extensive use of square tiles, so I do not see how the sense 'four' could have developed out of 'corner(ed)'.

    The root underlying Italo-Celtic *kWetro- 'corner' could be *kWet- or *kWetX-, since de Saussure's Effect would delete a laryngeal before tautosyllabic *-ro-. I used to regard *kWetro- as derived from *kWet- 'to cut', which I consider reflected in Lat. _cossus_ 'intestinal worm, tapeworm' (i.e. 'segmented (worm), annelid' < PIE *kWot-tó-) and Gaul. *pettia 'piece' (Proto-Celtic *kWetto- < PIE *kWet-nó- 'cut (apart), broken'). Gaul. *petto- 'cut (off), broken', applied to landforms 'steep, precipitous', probably underlies several topographic terms: Gascon _petarro_ 'steep slope; hill', Basque _petar_ 'steep slope; steep path', and Galician _petón_ 'crest, peak'. But since 'four' requires a laryngeal, I now think *kWetro- also involves a set.-root *kWetX-.

    The original sense of *kWetX- was likely 'to make pointed', similar in sense to *h2ak^- 'to sharpen'. One explanation of 'eight' is the dual of a passive deverbative of *h2ak^-, namely *h2ok^tóh1(w) 'the sharpened pair' (of non-thumb fingers held pointedly together). Had the singular been used for 'four', confusion would likely have arisen between the ordinals 'fourth' and 'eighth' and other derivatives, making replacement of one cardinal necessary. (Avestan _as^ti-_ 'breadth of four fingers' is sometimes cited in this connection, but the meaning of the word is doubtful. Another theory holds that Proto-Kartvelian *oxto- 'four' was borrowed from PIE *h2ok^tó- when the singular was still in use, but I am not qualified to evaluate this.)

    Proto-Slavic *c^etýre 'four' requires a laryngeal. It cannot be in the position *kWetúXr-, for this would produce a long /u:/ in zero-grade forms like Skt. _catúrah._ masc. acc. 'four', _catúh._ 'four times', _turí:ya-_, _caturthá-_ 'fourth'. It must reflect *kWetXúr-, with the accent as in PSlv *býti 'to be' from zero-grade *bHXú-. Forms with /o/-grade, the nom. coll. *kWet(X)wó:r and nom. pl. *kWet(X)wóres, would have lost the laryngeal by de Saussure's Effect. Forms with /e/-grade would eventually lose the laryngeal reflex by paradigmatic analogy, and it was in hidden position before /u/ in most languages in zero grade.

    1. I don't agree that Proto-Slavic requires a laryngeal in the numeral 'four'. Baltic shows no long *ū and no tonal effects attributable to a laryngeal (cf. Lith. keturì). I have yet to see a case where a prevocalic laryngeal lengthens a vowel in Slavic and nowhere else. Laryngeal metathesis, to the extent that it can be documented, is not a Proto-Slavic process.

      The problem of "four" is quite complex and requires some space for a proper discussion, so how about the following idea: the next blog (to appear soon) will be an excursus on the Indo-European numeral 'four'. I'll return to "linguistic function" afterwards.

  12. To my knowledge, PIE-speakers did not make extensive use of square tiles, so I do not see how the sense 'four' could have developed out of 'corner(ed)'.

    I've found the passage in Bomhard (2007: 412f.). It merely quotes a paragraph from Burrow (1973: 259) almost without comment, and that very short quote doesn't mention the Slavic reflex, let alone the laryngeal it requires. Both Burrow and Bomhard's short comment, however, mention a Sanskrit word catvarám "quadrangular place, square, crossroads"; that would work without tiles.

    Bomhard also cites the Proto-Kartvelian "four" as *otxo- rather than *oxto-, but perhaps this single occurrence is just a typo. On the other hand, I'm surprised that PIE */kʲt/ would be borrowed as */χtʰ/, let alone as */tʰχ/, instead of an ejective plosive cluster */kʼtʼ/, and I'm equally surprised that the initial *h₂ wasn't borrowed at all when */χ/ was available! Something is wrong here.

    Allan R. Bomhard (2007): Reconstructing Proto-Nostratic – comparative phonology, morphology and vocabulary. Part Two: Morphology [part of volume 1, together with Part One]. Signum Desktop Publishing.
    Thomas Burrow (1973): The Sanskrit Language. 3rd edition. Faber & Faber.

    'intestinal worm, tapeworm' (i.e. 'segmented (worm), annelid'

    Tapeworms aren't annelids, they're flatworms. Their rear ends ( = most of the body) are segmented, though; they fall off and scatter like seeds, each containing gonads of both kinds.

    Proto-Celtic *kWetto- < PIE *kWet-nó-

    ...Wait. Is *-tn´- > *-tt- a regular sound change? (It's not mentioned in the Wikipedia article which otherwise looks quite detailed.) Because... that would immediately remind me of Kluge's law next door.


    Way above I wrote:

    /v/ behaves like an approximant all over the place:

    Or... like a "glide" or... I should have written "like /j/" in the first place: /l/ happily occurs at the ends of syllables all the time, and its long counterpart /lː/ is common.

  13. Bomhard also cites the Proto-Kartvelian "four" as *otxo- rather than *oxto-, but perhaps this single occurrence is just a typo.

    Bomhard is correct here; the usual PKart. reconstruction is *otxo (there are some more complex variants with parenthesised "optional" segments -- a practice that doesn't inspire much confidence).

  14. David: Thanks for the correction about tapeworms. My other flagrant error was typing PKtv *otxo- as *oxto-.

    The Celtic sound-change in question, parallel to Kluge's Law, was proposed by Stokes (KZ 29:375; IF 2:167-73) with several dozen examples in the second paper, not all of them convincing. One good example is PClt *menekki- 'frequent, abundant' from PIE *menegH-ní- (cf. *monogH- in Go. _manags_ 'many'; *mnogH- in OCS _mUnogU_ 'much, many'). Matasovic' (EDPC 265) suggests expressive gemination (so why not **meneggi-?) or assimilation *meneg-ki-, but considers the root-shape very un-IE anyway. The ablaut is very IE and the root is effectively disyllabic (cf. Lat. _sepelio:_ ~ Skt. _saparyáti_; Grk. _pélekus_ ~ Skt. _paras'ú-_). I think Stokes hit the nail on the head.

    Stokes does not analyze PClt *bekko- 'small' but it follows easily from PIE *bHeg-nó- 'broken (into pieces)'. Matasovic' (ib. 60) suggests assimilation from *bHeg-ko- but admits that such an adjective has no morphological basis in PIE. On the other hand /nó/-participles to /e/-grade roots with a single stop following /e/ do occur, e.g. Grk. _semnós_ 'revered' < PIE *tjegW-nó-.

    Going systematically through Matasovic''s protoforms containing geminated tenues, I find only one which REQUIRES non-Kluge-Stokes assimilation, *frikka: 'fart' < *prid-ka: < PIE *pr.d-káh2. Yet this analysis is supported by a SINGLE word, Modern Welsh _rhech_, making it highly dubious. I think today's Celticists should dig out Stokes' papers, filter his results through modern knowledge, and modify their theories of Celtic geminated tenues accordingly.

    I find the same thing in Italic, e.g. Lat. _pecco:_ 'I sin', Umb. _pesetom_ 'sin' from Proto-Itc. *pekka: or *pekkom 'blemish, fault' < PIE *(s)pek^-nó- 'seen, visible (mark)'; LL _ubuppa_ 'feeding bottle' through P-Italic from Proto-Itc. *ugW-uppa: 'wet woven container' (i.e. ceramic vessel for liquids formed over a woven bag) from PIE *ugWó- 'wet', *h2/3ubH-náh2 'woven object' (= PClt *uppa: reflected in Vannetais Breton _offen_ 'stone trough'); Lat. _glittus_ 'sticky' < PIE *glidH-nó- 'made sticky'; Lat. _siccus_ 'dry' < PIE *sikW-nó- 'poured out, emptied, drained' (= PClt *sikko- 'drained (from the air), condensed' > Ir. _sicc_, _siocc_ 'frost', one of Stokes' examples).

    Likewise Venetic *potto- (in Venetian _poto_ 'metal beaker for water') and Ligurian *potto- (in Provençal _pot_ 'pot'), from PIE *pod-nó- 'earthen vessel' (against simple *podó- reflected in Gmc. and Baltic) show KSA in these poorly attested Western IE daughters. Balto-Slavic has counter-examples and does not belong in WIE. The Kluge-Stokes assimilation presumably occurred in Proto-WIE (ca. 2000-1500 BCE?) and the geminated tenues were unaffected by Grimm's Law in Proto-Gmc.

    1. That is fascinating. If this is published in any place Wikipedia could cite, let me know! (I'm currently overhauling the article on Kluge's law, which was in very bad shape... and only exists in English, not even in German).

      Two potential objections I had evaporated when I thought more about them. :-) Still, there's one thing I don't get: instead of assimilation, Latin usually has metathesis when a plosive is involved, and apparently there are Celtic examples where nothing happened: L fundus, cognate to Sanskrit budʰná and, following some confusion between nominative and genitive, to bottom; Old Irish domain and Welsh dwfn, cognate to deep (shortening behind diphthong) and (Kortlandt 1991) to Lithuanian dubùs "deep, hollow", where the lack of Winter's law indicates PIE *bʰ, not *b; L lambo to early Dutch lappen; L lingo to lick; L runco to rock; L rumpo to German rupfen; L tundo to German stutzen; Welsh sugnaf to suck; L stringo to Middle Dutch stricken "make fall"; L tango to Old English þaccian "pat", all taken from p. 43–44 of this book on Google Books.

    2. Another common outcome in Latin is the assimilation of the oral stop to the nasal: *swepno- > somnus, *petnā > penna, *deḱ-no- > dignus [-ŋn-], etc. Presumably, the intermediate stage here was *-bn-, *-dn-, *-gn-, with a voiced stop. In these examples the pre-Italic accentuation was at least possibly initial, but the same kind of assimilation took place in agnus, through Greek and (indirectly) Germanic both point to *h₂agʷ-nó- (in Germanic, the medial cluster yields *-wn- in the Vernerian context; the noun survives in a denominal verb, cf. Eng. yean).

    3. Oops, for "through" read "though".

    4. I gather from Pokorny (IEW 913) that We. _sugnaf_ would be **sunaf (= MBr _sunaff_) had it not crossed with _sug_ 'sap', borrowed from Lat. _su:cus_. The verbs are apparently denominal from native _sun_ 'sap', for which Pok. gives *seuk-n- as protoform; my guess is that the original accent was on the full-grade first syllable, so no Stokes assimilation.

      It is more difficult to explain PClt *dubno-, dubni- here. My guess is that inherited *duppo-, *duppi- were replaced by analogy with *dubnom 'bottom, ground', a primary neuter noun, thus *dHúb(H)-nom (> OCS _dUno_, Russ. _dno_). Balto-Slavic has some forms with, others without, Winter's lengthening, so some borrowing of geomorphic terms has probably occurred and *dHeubH- is not secure.

      For Italic, the closest thing I have to a publication is Cybalist #68402, corrected by #68416 (both Jan. 2012). I no longer support the last paragraph of #68402.

      The present-system nasal infix does not occupy the same position as the /n/ in the suffix *-nó-, so there is no discrepancy between (for example) Lat. _lingo:_ and Gmc. *likko:n, i.e. *li-n-g^H- and WIE *likkó- < *lig^H-nó-. Problems do arise with _fundus_, _unda_, and _pando:_, all of which have been cited as examples of Lat. metathesis of (pretonic) *-d(H)n- or *-tn- (Thurneysen, KZ 26:301-14; Buck, OUG §§81, 99.4, 213.2, 233; Sihler, NCG §222.2).

      I note first that Grk. _rukáne:_ has been borrowed as Lat. _runcina_ (Varro+; only *rucina in Romance, REW 7445), but (Dor.) _ma:khaná:_ as _ma:china_, and _kakházo:_ remodelled as _cachinno:_, not **manch-, **canch-. The verb _runco:_ must have acted folk-etymologically on *rucina in Roman Latin. Since liquids are poured from vessels by turning bottoms up, _fundo:_ could similarly have acted on a protoform with *fud-, or the choice of _fundus_ (= MIr _bond_, _bonn_ 'solea') over competing forms is likely due to _fundo:_. A protoform corresponding to Skt. _budhnáh._ would give OL *futtos 'bottom', possibly the base of _futtilis_ 'of lowest quality, practically useless', with _fu:tilis_ then a hypercorrect urbanism.

      Lat. _unda_ has nasalization like Lith. _vanduô_, dial. _unduô_, Old Pruss. _unds_, _wundan_ 'water'. This is probably after the /n/-infixed verb *un(e)d- 'to make wet' (Skt. 3sg. _unátti_, 3pl. _undáti_); one would expect Lat. *undo: (3rd cj. against denom. _unda:re_), and waves do make things wet.

      I take Lat. _pateo:_ and _pando:_ as related, but I do not derive the latter from *pat-no: as many scholars do. The relation is more like that between Lat. _lateo:_ and Grk. _lantháno:_. That is, while _pateo:_ is a stative built to the /tó/-participle of PIE *peh1- '(to be) open, exposed', _pando:_ is the nasal-infixed zero-grade present of the extended root *peh1-dH-. Further details are in #68402.

    5. I see the following outcomes in Latin, with examples:

      (1a) ´-Pn- > -mn-. Lat. _damnum_ = ON _tafn_ (nt. noun, no KL assim.)
      (1b) -Pn-´ > -pp-. Lat. _lippus_ < *h2libH-nós, cf. Grk. _aleípho:_
      (2a) ´-Tn- > -nn-. Lat. _annus_ = Gothic _aþna-_
      (2b) -Tn-´ > -tt-. Lat. _mitto:_ < *smid-né/ó-, cf. Eng. _smite_
      (3a) ´-Kn- > -n,n- (written -gn-). Lat. _lignum_ < *lég^-nom (nt. n.)
      (3b) -Kn-´ > -kk-. Lat. _muccus_ (> _mu:cus_) < *mug-nós, cf. _e:mungo:_
      (4a) ´-KWn- > -n,n- (-gn-). Lat. _agnus_ < *h2ágW-nos (see below)
      (4b) -KWn-´ > -kk-. Lat. _siccus_ < *sikW-nós, cf. Skt. _sincáti_
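
      A minimal sketch, taking the eight outcomes above at face value, of the same table as a lookup keyed by stop class and accent position. The names `OUTCOMES` and `latin_outcome` and the labels "before"/"after" are hypothetical; the code only restates the proposal and adds no evidence for it:

```python
# Proposed Latin outcomes of stop + n clusters, keyed by
# (stop class, position of the accent relative to the cluster).
# "before" = accent precedes the cluster (the a-cases),
# "after"  = accent follows the cluster (the b-cases).
OUTCOMES = {
    ("P", "before"): "mn",   # (1a) damnum
    ("P", "after"): "pp",    # (1b) lippus
    ("T", "before"): "nn",   # (2a) annus
    ("T", "after"): "tt",    # (2b) mitto
    ("K", "before"): "ŋn",   # (3a) lignum, written -gn-
    ("K", "after"): "kk",    # (3b) muccus
    ("KW", "before"): "ŋn",  # (4a) agnus, written -gn-
    ("KW", "after"): "kk",   # (4b) siccus
}

def latin_outcome(stop_class: str, accent: str) -> str:
    """Look up the proposed Latin reflex of a stop + n cluster."""
    return OUTCOMES[(stop_class, accent)]
```

      Laid out this way, the pattern is that every accent-after case yields a geminated tenuis, while every accent-before case yields a nasal cluster.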

      I take _lippus_ as originally 'rubbed', hence 'sore' (of eyes which require rubbing) and (s)mitto: as 'I smite (with a missile), I send (a missile)'. That _muccus_ was original is shown by Pompeian EXMVCCAVIT and Italian _smocc(ol)are_ 'to snuff (a candlewick)'. It was evidently regarded as dialectal, like _Juppiter_, leading to hypercorrect urban _mu:cus_.

      Lat. _dignus_ and _magnus_ are primary adjectives, presumably oxytone, so one might expect **deccus and **maccus instead. Vocatives have recessive accent, however, and these adjectives were likely to be heavily used as such, 'O worthy one, O great one', with paradigms remodelled after the original vocatives.

      The only basis for taking 'lamb' as oxytone is Grk. _amnós_, which resembles a substantivized participle. In literary Attic-Ionic, _amnós_ is primarily used as the suppletive nom. sg. of _aré:n_ (Cretan _ware:n_). Other cases are all in common literary use, but not _aré:n_. This word must have had three stems in PIE: strong *wr.h1é:n-, middle *wr.h1én-, weak *wr.h1n-´. The weak stem displaced the middle stem and then ousted the strong stem from all cases but the nom. sg., leading to a situation parallel to Latin _caro:_ 'flesh', acc. sg. _carnem_, etc. Apparently _aré:n_ was morphologically unpalatable to most authors and replaced by _amnós_. But this word was not originally synonymous with _aré:n_; it denoted in particular an older animal suitable for sacrifice (cf. Liddell-Scott rev. suppl. s.v. _amnós_, 1996). Thus the accent which has come down to us could have been influenced on the one hand by the gen. sg. _arnós_ in the suppleted paradigm, or indeed by the Late Greek nom. sg. _arnós_ (Aesop's Fables), which was likely in demotic use much earlier, and on the other by _semnós_ 'solemn, holy'. This makes it plausible that _agnus_ continues a primary PIE noun *h2ágW-no-, and the original Greek nom. sg. was *ámnos. A similar Greek accent-shift took place in _agrós_ 'field' against Skt. _ájrah._ 'plain'.

      Neither example commonly cited to show Gmc. *-wn- in Verner's position is compelling. In fact we have OHG _(h)nicchen_ 'to nod (repeatedly)' = NHG _nicken_ against _(h)nîgan_ 'to lean, bow' = _neigen_, continuing PIE *knigWH-né/ó- against *knéigWH-, Gmc. *hnikk- and *hneigW-. I suspect that Gmc. *seuni- 'sight' was not inherited from PIE *sekW-ní-, but that the inherited *sekkí- was replaced by *sexW-ní- through analogy with more transparent deverbal nouns, became *segwní- after Verner's Law, and regularly lost *g here (cf. Seebold, KZ 81:132). Gmc. *hnikk- was not replaced because many verbs had Kluge doublets of similar form.

      The expected PGmc *ak(W)na- 'lamb' from *h2ágW-no- could easily have been contaminated by *awi- 'sheep', yielding *auna- 'lamb' and *auno:n 'to yean'. De Vaan (EDL s.v. _agnus_) suggests as much.

      Thus no good labiovelar evidence distinguishes Kluge's Law from Stokes' Law in Celtic, or from its counterpart in Italic. It makes sense to backdate KL to the Old Western IE stage, and assume that the resulting geminated tenues passed unscathed through Grimm's Law.

    6. The only basis for taking 'lamb' as oxytone is Grk. _amnós_, which resembles a substantivized participle.

      That the 'lamb' word was barytone is also indirectly supported by the acute accentuation of its Proto-Slavic reflex *a̋gnę.

    7. I realise I may be opening a can of worms, but wouldn't a pretonic vowel lengthened and acuted by Winter's Law have attracted the stress anyway, even in an original oxytone?

    8. Not sure, but would a Winter-lengthened vowel really trigger Hirt's Law? Or does it require original -VHC-?

    9. Consensus has it that Winter's Law is later than Hirt's Law. What I was thinking of, however, was the Proto-Slavic process that pulled the stress back to an acuted root syllable, e.g. *āblukó- > *(j)a̋blъko. But then of course a diminutive formation like *āgnikó- (> *(j)a̋gnьcь) would have had final stress no matter what the original accentuation of the PIE stem (not preserved in Slavic without a suffix).

  15. Oh, wait:

    so why not **meneggi-?

    You need to assume devoicing of all long plosives anyway: assimilation to [n] is more likely to produce something voiced than something voiceless, and in some of your examples – including this one – the plosive was voiced to begin with. Of course, this extra step of devoicing has to be expected: long voiced plosives are hard to pronounce, so you're not going to bother keeping them voiced unless you also have long voiceless plosives to distinguish them from (as Italian does today).

    I think "expressive gemination" is nonsense, though. I have trouble imagining that a paralinguistic effect would introduce a series of wholly new phonemes, and I have trouble imagining that it would apply to a word that has as little emotion attached to it as "much". The Neogrammarians may have overdone a few things, but going back* to unfettered Romanticism doesn't strike me as a good idea.

    * For Germanic, expressive gemination was first proposed in 1869.

    1. I wouldn't rule out "expressive gemination" in Germanic as a secondary phenomenon. You get the first round of geminates (voiceless stops as well as liquids) from the complex of changes known collectively as Kluge's Law, i.e. via regular sound change. But Germanic hypocoristic words were often weak nouns with a nasal suffix that triggered Kluge's Law in some inflected forms (for details, see Guus Kroonen's book). This may have led to associating gemination with emotive value; then gemination (not restricted to voiceless stops and liquids) became co-opted as a kind of derivational device. Hence numerous diminutive names like Old English Offa or Old Norse Siggi, and diminutive common nouns like rudduc, readda 'robin redbreast'.

    2. My parenthetical question about **meneggi- was directed rhetorically at Matasović's suggestion of expressive gemination for this lexeme within Celtic. Attempting to rationalize Kluge's Law is not likely to produce a generally acceptable outcome. The best we can do is try to describe it accurately and establish its chronology.

      Hypocoristic gemination in Gmc. weak (nick)names is undeniable, and it makes sense that hypocorisms occasionally cross the line from proper names to appellatives. This is probably the source of most old Gmc. geminated mediae. I agree with Piotr that OE _docga_ 'dog' originated as a hypocorism, although I disagree about what the original root was.

      A likely Celtic example is *gobbo- 'muzzle, snout, beak' (EDPC 164) which could easily have been extracted from a hypocoristic *Gobbo:n 'Beaky, Bignose' vel sim. If the rules for forming hypocoristics were broadly parallel to those in Germanic, several root-shapes for the original appellative are possible.

      But in general, I do not like "expressive gemination" as an etymological device. That is why I prefer KSA to explain, for example, Lat. _siccus_. I see no reason why dryness should require more "expressivity" than wetness. In fact, I would expect the reverse.

    3. I wouldn't rule out "expressive gemination" in Germanic as a secondary phenomenon.

      Oh, I agree: once phonemic consonant length exists, you can use it.

  16. Several of my comments above require revision. I’ll begin with the Germanic ‘brother-in-law’ word, for which I proposed a compromise-form to get around the phonetic difficulties (17 & 20 Aug. 2014). It’s more straightforward to posit a specific laryngeal (residue) metathesis in Proto-Gmc. followed by Cowgill’s Law. In this scenario PIE *deh₂j-wer- (weak stem *-ur-) was colored to *dah₂jwer-, -ur-. A laryngeal (residue) adjacent to a cluster of both semivowels was metathesized to intersemivocalic position, so the strong stem became *dajh₂wer-. The weak stem *dah₂jur- was analogically altered to *dajh₂ur-, since most utterances of ‘brother-in-law’ were in the nom. or acc. case. Cowgill’s Law operated on the strong stem, yielding *dajgwer-; again the weak stem followed suit to *dajgur-. Grimm’s Law operated, and a thematic noun *tajkura- was built on the old weak stem *tajkur-. (Alternatively the thematic noun was created before Grimm’s Law.) Simple epenthesis of *k (or pre-GL *g) between the semivowels is unacceptable, because it didn’t happen with *newja- ‘new’ = Ved. návya-, etc.
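
The ordering of the steps in the comment above can be traced mechanically. What follows is only a toy sketch, not a phonological model: the rule functions, their names, and the plain string replacements are assumptions made for illustration, each rule is written narrowly enough to fit this one word, and the analogical reshaping of the weak stem is simplified to "copy the strong stem and swap *-wer- for *-ur-".

```python
# Toy trace of the proposed derivation of Gmc. 'brother-in-law'.
# All function names and rule formulations are illustrative assumptions.

H2 = "h\u2082"  # the laryngeal h2, written with a subscript two


def color(stem):
    # h2 colors an adjacent preceding *e to *a.
    return stem.replace("e" + H2, "a" + H2)


def metathesize(stem):
    # A laryngeal next to the cluster *jw moves between the two semivowels.
    return stem.replace(H2 + "jw", "j" + H2 + "w")


def cowgill(stem):
    # Cowgill's Law, reduced here to: laryngeal before *w becomes *g.
    return stem.replace(H2 + "w", "gw")


def grimm(stem):
    # Grimm's Law, reduced here to the mediae in this word: *d > *t, *g > *k.
    return stem.translate(str.maketrans("dg", "tk"))


RULES = [
    ("coloring", color),
    ("laryngeal metathesis", metathesize),
    ("Cowgill's Law", cowgill),
    ("Grimm's Law", grimm),
]


def weak_from(strong):
    # Simplification: at every stage the weak stem mirrors the strong stem,
    # with *-ur- in place of *-wer- (standing in for the analogy described).
    return strong.replace("wer", "ur")


def derive(strong="de" + H2 + "jwer"):
    # Apply the rules in order, recording the strong and weak stem each time.
    stages = []
    for name, rule in RULES:
        strong = rule(strong)
        stages.append((name, strong, weak_from(strong)))
    return stages


if __name__ == "__main__":
    for name, strong, weak in derive():
        print(f"{name:22} *{strong}-  *{weak}-")
    # The thematic noun is built on the weak stem of the last stage.
    print("thematic noun: *" + derive()[-1][2] + "a-")
```

Reordering the entries in `RULES` shows why the sequence matters in this scenario: if the Cowgill step runs before the metathesis, the laryngeal is not yet in front of *w and no *g arises.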

    1. Simple epenthesis isn't the claim of Seebold's rule, which was presented above 3 years ago.