06 September 2015

A Jan’s Chance: The Fate of Innovations

Imagine that you start a linguistic innovation. One fine day you decide to replace the English word dog with a new, hitherto unused word — for example, jan. As of now, you will say, “I have to walk the jan”, “My jan’s name is Bruno”, and, “The jan is man’s best friend”. You will substitute jan for dog in set phrases such as “go to the jans” and “every jan has its day”. Jan would do its job neither better nor worse than dog. Both are arbitrary sound sequences (their pronunciation does not suggest what they mean); both are short and easily pronounceable. Dog has only one obvious advantage over jan: it is already an established, familiar, commonly used English word. There is no compelling reason why people should find it a good idea to abandon it just like that and learn to use a different word for the same concept. If you are really determined (and perhaps slightly nuts), you can try persuading your family and close friends to humour you and adopt your innovation when they are talking to you. You can bring up your children informing them that your family pet Bruno is a jan. But sooner or later they will find out that everybody else calls jans (including Bruno) dogs. Your experiment will almost certainly fail. Not because the word jan is useless, but because the function you’d like it to have is already carried out equally well by another word. It makes jan a “neutral” innovation — one that could play its role well enough but has no functional advantage over a preexisting competitor.

On the other hand, something similar to this thought-experiment really happened about one thousand years ago. The word docga (the Old English ancestor of dog), coined by an unknown innovator at an unknown date*), somehow became a widespread synonym of the established Old English word hund, and after a few centuries managed to replace it in the mental lexicon of every English-speaker of the time. Although its dethroned predecessor did not become completely obsolete, its frequency of use dropped by at least an order of magnitude, and it had to undergo narrow semantic specialisation in order to survive. Today, a hound is a special type of hunting dog, not just any dog in general. And if you look at other languages, you will occasionally see similar cases of lexical replacement. French chien and Italian cane go back to Latin canis, as expected, but Spanish perro is an innovation (about as mysterious as dog). It seems some new words for old things do catch on, albeit rarely. The chances are slim but apparently larger than zero.

A selfie with a jan (whose name is not Bruno)
A lexical innovation is more likely to succeed if it finds and conquers a functional niche not yet occupied by any other word. In this way it makes itself useful, which may give people a powerful incentive to adopt it. For example, the word selfie made its first recorded appearance in September 2002, in Australia (or rather in the Australian sector of cyberspace). Within the next few years it grew popular among (mostly young) English-speaking Internet users worldwide, slowly gaining the status of buzzword. Then it infected Facebook communities and its popularity soared to the zenith (as did the number of selfies published online). In 2013 Oxford Dictionaries declared it the Word of the Year.

How is it possible for an innovation to become “fixed” in a large speech community? How do the chances of fixation depend on the functional value of the innovation? What is that functional value? What happens to innovations that have enjoyed some success but haven’t yet reached fixation? This is what my next blog posts will be about.

*) Nobody knows for sure where Old English docga came from. My own modest etymological proposal can be found here.


  1. Anne L. Klinck identified a possible OE verb derived from docga* in the elegy of Wulf and Eadwacer (The OE Elegies 171-2). Line 9 of the elegy reads:
    Wulfes ic mines widlastum wenum dogode.
    Some scholars have corrected _dogode_ to _hogode_ 'thought', but Klinck argued that _hogode_ would be unlikely to take a genitive or dative object. She suggested instead that it is the preterit of a Class II weak verb *dog(g)ian 'to follow (like a dog)', taking a dative object like _fyl(g)ian_ 'to follow'. Not all scribes were fastidious about writing -cg- or -gg- for /gg/. Klinck took the dat. pl. _widlastum_ 'wide tracks, far journeys' as the object of _dogode_, and the dat. pl. _wenum_ as adverbial, 'with hopes, in hopes'. Thus she rendered the line 'I followed the far journeys of my Wulf in (my) hopes.' Further on (p. 244), noting several other unusual words and stylistic peculiarities in this elegy, she concluded "... it is very probable that _Wulf and Eadwacer_ comes from a popular background, rather than the aristocratic tradition to which most Old English poetry belongs." Thus 'dog' as both noun and verb may have been well established in lower-class OE speech long before 1050. Most canines owned by poor folks were likely to be docgan* rather than _hundas_, and the former term would not be derogatory to these speakers. Nor would a derived verb *doggian (possibly going back to Anglo-Frisian *doggōjan, or even PGmc *doggōnaN) carry any negative baggage with lower-class speakers. It would signify 'to follow faithfully'.

    I agree that 'dog' probably began life as a hypocoristic, but I find it hard to believe that OE _dox_ could have produced docga* 'Darkie' this way. None of the case-forms of _dox_ contained the voiced dorsal /g/, and the fact that the /k/ in _dox_ /doks/ represented an underlying /h/ whose opposition with /g/ was neutralized in such clusters would have been irrelevant to ordinary speakers. The cited examples of hypocoristic truncation, OE _Totta_ for _Torhthelm_ and _Beoffa_ for _Beornfriþ_, rather lead me to think that _dox_ would have led to *Docca or *Dossa for a dark or dark-faced dog. OE _frogga_ and _frox_ 'frog' do not provide a convincing parallel since Guus Kroonen can get most of the 'frog' words by paradigm-splitting and length-contamination from a PGmc n-stem *frugōN, gen. *frukkaz; somewhat similarly OE *fogge and _fox_ (EDPG 156-8; PGmc n-stems 23, 69-78).

    The fact that OE docga* referred to a sturdy, stout, inelegant type of canine suggests that a term indicating such features could serve as a protoform. One such is Shetl. _dorg_ 'stout person', with Icel. _durgur_ 'crude man' and _dyrgja_ 'coarse stout woman'. PGmc *durga- would give *duggan- as hypocoristic (OE *doggan- with /a/-umlaut), just as ON names in Berg- give _Beggi_. Kroonen has only PGmc *durgō- 'fishing-line' (EDPG 110). A fishing-line wound up on a stick forms a stout clump. This sense-drift is also shown by Norw. _droll_ 'big stout person' lit. 'something wound up'; cf. MD _drol_, _dral_ 'coarse thread; short stout person'; ON _drymba_ 'coarse article of clothing' against Icel. _drymbi_, _drumbur_ 'chump of wood', indeed Faer. _drymbingur_ 'big dog'. I suspect that *duggan- 'coarse stout canine' was already in use in NWGmc, but the fem. form in the sense 'low-class bitch' became a derogatory term 'coward' in NGmc (ON _dugga_, OSwed _dugge_) driving the masc. out of use. Then again, other derivations are possible, and I may be barking up the wrong tree.

  2. Thanks, Douglas! With fox ~ *fogga (-e?) the problem is that the attested nasal stem (Goth. fauho etc.) is f. *fuxōn- (no Vernerian voicing). Of course a related *uxsin-type masculine like *fuɣin-/fukk- (→ OE *fuggan- by contamination) is not impossible, but since there is no other evidence for it, the whole scenario looks rather convoluted and speculative.


    1. Kroonen does not explicitly address *fuxōn- against *fuggōn-, but he provides a template for such an alternation (PGmc n-stems 71). His model paradigm of nom. laþō, gen. lattaz, dat. ladeni yields three new paradigms by generalizing þ, tt, or d and maintaining length. The third new paradigm has nom. ladō, gen. laddaz, dat. ladeni. In principle the gen. could generate a whole new paradigm with dd throughout. This apparently is his mechanism for getting *fuggōn- out of *fuxōn- which he calls "the feminine to *fuxsa-", derived from PIE *puḱ- 'tail' vel sim. (EDPG 157-8). Obviously however *fuxōn- cannot be a late creation from *fuxsa- or there would be no alternation.
      With no hint of full grade in the root, it seems best to posit a hysterokinetic 'ox'-type animate, PIE nom. *puḱ-ḗn, gen. *puḱ-nós '(prominently) tailed animal' vel sim., beside a thematic by-form *puḱ-só- which took over masculine usage. This would lead (after applicable PGmc morphological shifts) to a default feminine nom. *fugō, gen. *fukkaz, dat. *fugeni. Kluge's paradigmatic-contamination mechanism could produce the alternate stems *fukkōn-, *fukōn-, and *fuggōn-, the latter preserved by chance in OE fogge. But the only way to get *fuxōn- is out of the barytone vocative, PIE *púḱ-en. This is not as crazy as it might sound. Mark Twain portrays Tom Sawyer as practicing divination by doodlebug: "Doodlebug, doodlebug, tell me what I want to know!" A similar formula 'Vixen, vixen ...' might have been repeated by those who thought they could foretell the future by reading her movements, and the whole paradigm remodelled after the vocative.
      Now while Kroonen does a good job defending Kluge's theory against Kauffmann and other critics, he goes off the deep end positing Finnish-style consonant-gradation for PGmc. No attested Gmc. lg. has any such thing. What we have are stem-alternants arising from Kluge's Law and subsequent contamination carried through the whole paradigm in a given dialect, not gradation within a synchronic paradigm. Also I cannot believe that hypocoristic gemination is the direct or indirect result of Kluge's Law. The latter never produces anything like Tubbi for Þorbjorn or whatever. Kroonen fails to distinguish between hypocorisms and the results of KL with contamination. While this may not always be possible, it is clear that the four stem-variants of 'knave' had to arise by Kluge's mechanism, while *krabban- 'crab' is almost certainly a hypocoristic of *krabita- 'crab' (or its preform *kraba- 'hard(ened)' if from PIE *grobʱós; possibly ~ Grk. γραψαῖος 'crab' < *γραψή 'suit of armor, exoskeleton' < *gr̥bʱ-tjéh₂ 'set of hard parts'?). Likewise *maþþōn- 'moth' against *maþan- 'maggot' and *kazzan- 'male ptarmigan' against *kazan- 'id.' (n-stems 70) are probably hypocoristics. They do not show the variety of stem-consonantism of 'knave'. Indeed most of the n-stems with long fricatives cited by Kroonen appear to be hypocoristics.
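
The levelling template quoted in this comment (model paradigm nom. laþō, gen. lattaz, dat. ladeni) can be illustrated with a small sketch. It is only a reading of the description given here, not Kroonen's own formalism: the function name `level`, the three-part split of each form, and the rule "copy the consonant quality, keep each form's length" are assumptions made for illustration. Only the paradigm produced by generalizing d is stated explicitly in the comment; the other two follow from applying the same assumed rule.

```python
# Hypothetical illustration of paradigm levelling "maintaining length".
# Each case form is split into (prefix, root-final consonant, ending).
MODEL = {
    "nom": ("la", "þ", "ō"),
    "gen": ("la", "tt", "az"),
    "dat": ("la", "d", "eni"),
}


def is_long(consonant: str) -> bool:
    """Treat a consonant as long if it is written double (e.g. 'tt')."""
    return len(consonant) == 2 and consonant[0] == consonant[1]


def level(paradigm: dict, source_case: str) -> dict:
    """Generalize the consonant quality of `source_case` to all forms.

    Assumption: only the quality (þ, t or d) spreads; a form that had a
    short consonant keeps a short one, a form that had a long one keeps
    a long one.
    """
    quality = paradigm[source_case][1][0]
    levelled = {}
    for case, (prefix, consonant, ending) in paradigm.items():
        new_consonant = quality * 2 if is_long(consonant) else quality
        levelled[case] = prefix + new_consonant + ending
    return levelled


if __name__ == "__main__":
    for source in ("nom", "gen", "dat"):
        print(source, level(MODEL, source))
```

Generalizing from the dative reproduces the third paradigm cited above (nom. ladō, gen. laddaz, dat. ladeni); a stem with dd throughout would need the further step of extending the genitive's long consonant to the other cases, which this sketch does not model.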

    2. I'm sure the Germanic word for 'crab' is a substrate loanword cognate to Semitic *ʕa-kˀrab- 'scorpion'. Other words with Semitic correspondences are more widespread within IE, as e.g. 'plough' and the numeral '7', whose Germanic form corresponds to a Semitic masculine instead of the feminine found elsewhere.

      As this corpus contains agricultural lexicon, it can be ascribed to the Neolithic farmers who came to Europe from the Near East.

    3. But the only way to get *fuxōn- is out of the barytone vocative, PIE *púḱ-en

      Why? *púḱ-en would be the vocative of puḱ-ḗn, but all stems of this type are masculine in Germanic. The *-ōn and *-īn of weak feminines are of hybrid origin: they are plain *-ah₂ or -ih₂ feminines secondarily transformed into nasal stems. So *fuxōn may be either *púḱah₂ + *-n or pre-Gmc. amphikinetic *púḱon-/*puḱn- reinterpreted as a Germanic *-ōn feminine. There are several possible pathways producing an accented nil grade in such paradigms.

    4. Also I cannot believe that hypocoristic gemination is the direct or indirect result of Kluge's Law. The latter never produces anything like Tubbi for Þorbjorn or whatever.

      It certainly can be an indirect result of Kluge's Law. KL introduced phonemic consonant length in the first place, and it produced long consonants in n-stem nicknames; eventually, other baby-language nicknames were interpreted as n-stem nicknames even if their long consonants weren't /pː tː kː/.

      What we have are stem-alternants arising from Kluge's Law and subsequent contamination carried through the whole paradigm in a given dialect, not gradation within a synchronic paradigm.

      Kroonen just postulates somewhat more systematic contamination than you do. Doesn't he?

      I'm sure the Germanic word for 'crab' is a substrate loanword cognate to Semitic *ʕa-kˀrab- 'scorpion'. [...] As this corpus contains agricultural lexicon, it can be ascribed to the Neolithic farmers who came to Europe from the Near East.

      But "7" can't possibly be a loan from a substrate. (And I'd really doubt that for "plough" as well.)

    5. Certainly, '7' and 'plough' can be classified as Wanderwörter, but other items don't fall into this category, so in my opinion they're evidence some of the languages spoken by European Neolithic farmers were related to Semitic.

    6. Back to this:

      I'm sure the Germanic word for 'crab' is a substrate loanword cognate to Semitic *ʕa-kˀrab- 'scorpion'.

      If that's a substrate word from Semitic-speaking Early European Farmers, you either have to be a glottalicist, or you'll run into a problem with Grimm's law. Two problems actually.

    7. While strictly speaking I'm not a glottalicist, my own interpretation of Grimm's Law is an unorthodox one, because I see it as an isomorphism.

  3. So a dog is a fuzzball? ^_^

    Goth. fauho etc.

    Because Gothic is sometimes weird about this, I'll mention some of the "etc.": German (hunters' jargon) Fähe "vixen".

    1. If it comes from OHG foha, voha, MHG vohe, the vowel is irregular (contamination? but with what?).

    2. Hm.

      I can only offer some obscurum per quoque obscurum. Instability between ō and ā which I can't explain, and for which I've never seen any explanation offered (for what that's worth), has happened elsewhere in German: OHG māno, Cimbrian /mano/ (no idea of length), other German Mond*; OHG thō > , MHG , modern da; OHG (IIRC), modern Laa in placenames, dialectal /lɒx/-, /lɒg/-, where /ɒ/ is the otherwise exclusive reflex of /a/ (vowel length lost across the board).

      * With an excrescent -d that additionally marks the end of the word; since Early New High German; also found in niemand "nobody".

    3. Should perhaps add that Mond is pronounced with /oː/ even though the spelling very strongly suggests otherwise.

    4. Many German dialects during the "Middle" period had a retracted and rounded pronunciation of historical ā (which, in the case of Mond and Monat, goes back to *ē). Some of these words were adopted into the mainstream "educated" pronunciation with /oː/ rather than /aː/. In several (though not all) cases the vowel is followed by a nasal (Mohn, ohne, Brombeere and a few others, but also Odem, Docht), which makes this development partly convergent with the Anglo-Frisian rounding of nasalised *ā (Eng. moon, month, broom). The rounded vowel in Mond, Monat began to pop up in the written standard in the 16th century.

      NHG da has resulted from the confusion of spatial and temporal pronouns in MHG due to the same dialectal rounding (see also wo from older ). The originally different words and became reinterpreted as competing variants of the same pronoun, and da eventually won. Not sure about the last one -- I'd have to check it up somewhere.

      Anyway, the 'vixen' word is different. OHG, MHG, MLG, ON and Gothic all agree that the protoform was **fux-ōn-.

    5. P.S. Ignore the accidentally duplicated asterisk. It's just an ordinary reconstruction, not a pre-pre-proto thing.

  4. That makes a lot of sense, of course, but it means the rounding (and perhaps the nasalization) found today in the Bavarian-Austrian dialects must be surprisingly young. My dialect has the /o/ reflex in every one of the words you mention*, except da, which has the /a/ reflex in both spatial and temporal meanings. It also has the /a/ reflex in Abend (let's see how well this displays: [ˈɒ̈m̩d̥] ~ [ˈɒ̈̃m̩d̥]; lenis plosives are regularly dropped in front of syllabic nasals), for which an isogloss between /ˈoːbig/ and /ˈɑːbig/ runs through Switzerland.

    * Except that of course it lacks the word Odem, a literary form I've probably only encountered in the context of pre-20th-century versions of Genesis; with its unshifted d it should be far, far northern in origin.

    1. The retraction of nasalised [a] (often accompanied by lip-rounding) is commonplace. It happened in Polish a few centuries ago (when the vowel was lengthened; otherwise it shifted in the opposite direction, giving rise to the modern distribution of ą and ę); it is happening in most European varieties of French, and it has happened several times in Germanic, at different times and in different lineages.

    2. I forgot to mention two things:

      1) If Fähe entered Standard German from an unrounding dialect, which is most of them, then all we need to explain the vowel is a suffix or an analogy that can trigger umlaut.

      2) Laa is probably the same as a place name element from elsewhere, Lohe.

      it is happening in most European varieties of French

      That process is long completed. Although the dictionaries keep pretending otherwise, I've never encountered unrounded [ɑ̃] this side of Canada. The only unrounded versions you can find on the French mainland are [aŋ] in stronger southern accents and [an] in generic African accents.

    3. I've found the etymology of the Laa/Lohe thing! Ringe (2006: 89):

      PIE *lówkos ‘clearing’ (cf. Lith. laũkas ‘field’, Lat. lūcus ‘grove’) > PGmc *lauhaz (cf. OE lēah ‘meadow’, OHG lōh ‘copse, grove’);

      It makes no sense to me that the reflex of /a(ː)/ is instead found in what is now Vienna.

    4. Following Villar, I think the ablaut vowel *o is a recent development in some IE branches, so it can't be reconstructed for earlier stages. This also explains why we find words with non-ablauting *a (or rather ) such as e.g. *ɑkwā 'river' or *abVl- 'apple' in contexts where laryngeal-colouring is out of the question.

    5. Interesting choices of examples. The first is limited, as we've seen and discussed on this very blog, to Italic, Germanic and maybe Celtic; the second, limited to Celtic and maybe Germanic (the Germanic version could be a Celtic loan), has been proposed to be a reflex of *mel- where *ml became *bl.

      Anyway, the ablaut vowel *o has nothing to do with foxes. Foxes are zero-grade, with *u.

  5. Speaking of linguistic innovations - an innovation is taking place in Swedish right now. A new third person pronoun has been introduced: "hen". "Hen" is a gender-neutral pronoun referring to people. It has a "functional advantage" since it is meant to replace phrases like "he or she" when the speaker doesn't know (or doesn't need/want to indicate) the gender of the person being mentioned. It is meant to look similar to "han" (he) and "hon" (she), perhaps influenced by the Finnish 3rd person singular pronoun "hän" - Finnish does not have grammatical gender; "hän" means both "he" and "she".

    Will "hen" catch on? Nobody knows yet. The word is controversial - some use it, others wouldn't touch it. I think "hen" has become sort of a shibboleth, separating two kinds of people - by using it (or refusing it) you show what kind of person you are. "Hen" is associated with people on the progressive political left, feminists and gay/queer rights activists.