Language Evolution: Two Is Company, Four Is a Party

07 October 2014

Neuter nouns with the suffix *-wr̥/*-w(e)n- are relatively rare in most branches of Indo-European. The only group where they can be found in great numbers is Anatolian. In Hittite, the suffix productively formed verbal nouns (names of actions), but there are also examples of nouns that had become independent lexical units, no longer bound to a particular verb paradigm. They had usually acquired a concrete meaning (referring to a thing or substance rather than an abstraction). One such noun is Hitt. pahhur/pahhuen- ‘fire’, evidently an ancient word, preserved in many branches of the family and showing evidence of archaic vowel alternations and mobile stress: nom/acc.sg. *páh₂wr̥, gen.sg. *ph₂wéns, etc. It may be etymologically connected with the verb *pah₂- ‘guard, protect’, but it is doubtful whether even the speakers of Hittite were still aware of any such connection: the semantic distance between the verb and its derivative was already too great.

Outside Anatolian, the suffix does not play any major role. The nouns that contain it are scattered remnants of a Proto-Indo-European pattern of word-formation. Their attestation is very uneven. They are quite well represented in Sanskrit and Greek, but only isolated examples are found elsewhere (the ‘fire’ word, which became part of Indo-European basic vocabulary sufficiently early, is exceptionally well attested). Here are a few typical *-wr̥/*-w(e)n- nouns evidently connected with known verb roots:

*h₂árh₃-wr̥, gen. *h₂r̥h₃-wén-s ‘arable land’ (root *h₂arh₃- ‘till, plough’);
*snéh₁-wr̥, gen. *sn̥h₁-wén-s ‘string, sinew’ (root *(s)neh₁- ‘spin, twist’);
*séǵʰ-wr̥, gen. *sǵʰ-wén-s ‘steadfastness’ (root *seǵʰ- ‘conquer, take possession of; hold, own’);
*h₁éd-wr̥, gen. *h₁d-wén-s ‘food’ (root *h₁ed- ‘eat’).

Their reflexes in the historically documented languages rarely display the whole range of vowel, consonant and stress variations, most of which were levelled out analogically in prehistoric times. Still, these alternations are reconstructible thanks to the fact that different fragments of the pattern have been preserved in different languages. They can be reassembled into a complete picture like the pieces of a jigsaw puzzle or the disarticulated skeleton of a fossil animal.

[Figure: Got wheels? A four-wheeled toy from the Cucuteni-Trypillian culture, early fourth millennium BC.]

Neuters of this kind formed collectives by inserting a lengthened *ō into the suffix. The collective of a count noun denotes simply a set of objects (a collective plural), while the collective of a mass noun like ‘fire’ denotes a particular quantity or sample of the thing in question (‘a fire, a burning mass’). This became one of the derivational mechanisms by which Indo-European mass nouns could be transformed into count nouns. The accent was commonly shifted to the suffix in the process, causing the reduction of the root vowel: *páh₂wōr (collective) > *ph₂wṓr > *pwṓr (a countable neuter with its own case forms such as gen.sg. *p(h₂)un-és). Still later, the distinction between the original mass noun and its collective could be blurred and abandoned, the younger form ousting the older and serving in both functions (‘fire’ or ‘a fire’). The archaic Proto-Indo-European form *páh₂wr̥ is unambiguously preserved only in Anatolian, while the remaining Indo-European languages show reflexes of *pwṓr or its further modified descendants.

Now we can view the reconstruction *kʷét-wr̥ in this light. Supposing it was derived from our hypothetical verb root *kʷet- ‘group into pairs’, the original meaning of *kʷétwr̥ (as a nomen actionis) would be something like ‘pairing’, and its collective *kʷétwōr would mean ‘a particular result of pairing, a complete set organised into pairs’. In the Proto-Indo-European world, there were many “natural” sets of things conceptualised as consisting of two pairs: human hands and feet; fore and rear legs of animals; the wheels of a wagon; the four directions, whether cardinal (east and west, north and south) or relative (forward and backwards, left and right); paired organs of perception (two eyes and two ears). This could have provided sufficient motivation for treating ‘4’ as the prototypical case of an “even collective”. An interesting parallel can be seen in the “fraternal” numeral systems widespread in Amazonia. In the languages that employ them, the numeral ‘4’ is derived from an expression meaning ‘each has a brother/companion/spouse’. At a more primitive stage, preserved in the Dâw language, there are only three “exact” lexical numerals, ‘1’, ‘2’, and ‘3’. The values from 4 to 10 are described as ‘even’ (‘has a brother’) or ‘odd’ (‘has no brother’). The precise value can’t be expressed linguistically, but the words ‘even’ and ‘odd’ can be supplemented by clarifying hand gestures:

Dâw speakers indicate ‘four’ by holding the fingers of one hand separated into two blocks; for ‘five’, they add the thumb; for ‘six’, they place the second thumb against the first to make a third pair; and so on until for ‘ten’ all fingers are grouped into five pairs, the thumbs together.

[Epps 2006: 265]
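
The even/odd tally described above can be sketched in a few lines of Python. This is only an illustration: the function names `daw_numeral` and `finger_gesture` are hypothetical, the English glosses stand in for the Dâw expressions, and the gesture is reduced to a count of finger pairs plus any unpaired finger, which is an assumption based on the passage quoted from Epps rather than a model of the actual hand shapes.

```python
def daw_numeral(n):
    """Describe a value the way the tally system in the post would.

    1 to 3 have exact lexical numerals; 4 to 10 are only 'even' or 'odd'.
    The English glosses are placeholders, not Dâw words.
    """
    if n in (1, 2, 3):
        return f"exact numeral '{n}'"
    if 4 <= n <= 10:
        return "even (has a brother)" if n % 2 == 0 else "odd (has no brother)"
    raise ValueError("the description in the post covers only 1 to 10")


def finger_gesture(n):
    """Return (finger pairs, unpaired fingers) for the clarifying gesture.

    Assumption: the gesture is modelled purely as pairs plus a remainder.
    """
    if not 4 <= n <= 10:
        raise ValueError("gestures are described only for 4 to 10")
    return divmod(n, 2)


if __name__ == "__main__":
    for value in range(1, 11):
        if value <= 3:
            print(value, daw_numeral(value))
        else:
            print(value, daw_numeral(value), finger_gesture(value))
```

On this reading the spoken word alone cannot tell ‘4’ from ‘6’, since both come out as ‘even’; only the number of finger pairs in the gesture separates them.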

Once established as a concrete numeral (rather than part of an even-odd tally system), *kʷétwōr (or *kʷətwṓr) was interpreted as an ordinary neuter plural, and – like the numerals ‘1’, ‘2’, and ‘3’ – formally an adjective, inflected not only for case but also for gender. This resulted in the analogical creation of the animate plural in *-wor-es (and the periphrastic feminine ‘four females’, soon univerbated and phonetically mutilated in the process). Note that if the adjective had been formed directly from the verbal noun *kʷétwr̥/*kʷ(ə)twén-, its animate plural would probably have ended up as *kʷet-won-es. In addition to the Greek and Vedic words for ‘fat’, already discussed, compare Greek peîrar (gen. -atos) ‘boundary’ < *pér-wr̥/*pr̥-w(e)n- versus the Homeric adjective a-peírōn (animate) ‘boundless, endless’ < *n̥-per-wōn.

All this suggests that the word *kʷétwr̥ (coll. *kʷétwōr) was transparently derived from a verb root and adopted as a cardinal numeral at a rather late date, perhaps in “Core Indo-European” (the non-Anatolian part of the family) rather than in Proto-Indo-European proper. It is a well-known fact that Anatolian has a different word for ‘4’, *meju- (Hittite meu-/meyau-, Luwian māwa-). Since the jury is still out on whether Hittite kutruwa(n)- ‘witness’ has anything to do with the numeral ‘4’*), we should seriously consider the possibility that the familiar reconstruction *kʷetwores is not Proto-Indo-European at all but represents a “dialectal” innovation which replaced its older synonym in the common ancestor of Tocharian and the extant branches of the family.

If this were a journal article rather than a blog post, I would now be obliged to account for every puzzling irregularity in the branch-specific reflexes of *kʷetwores and its variants. I will spare my visitors such excruciating details, but if anyone is really interested in discussing them, welcome to the Comments section.

And now back to other matters – next time.

*) A witness in court could be denoted as ‘the fourth man’ (beside the two contracting parties and the judge).

Reference

Epps, Patience. 2006. “Growing a numeral system: The historical development of numerals in an Amazonian language family”. Diachronica 23(2): 259-288.


37 comments:

Piotr Gąsiorowski, 10 October 2014 at 18:55
[I am going to address some of the points made by Douglas G. Kilday here. They are more closely connected with the proposals made above, so I have decided to redirect the discussion to the Comments space of this post. First, the curious vocalism of the Slavic cardinal numeral '4'.]

Douglas: I regret the early closure of this thread. I was eagerly awaiting the non-laryngeal explanation of the acute in Proto-Slavic *c^etýre.

One important type of non-laryngeal acute is that found in Balto-Slavic vr̥ddhied derivatives. In roots ending in VR a morphologically lengthened vowel gets acuted before a vowel-initial suffix. I find the evidence paraded by Miguel Carrasquer Vidal convincing, and his phonological explanation vastly preferable to other solutions. It explains why we find the acute in cases like *sla̋va ‘fame, glory’, where the root is evidently aniṭ (laryngealless), cf. *sloves- ‘word’ < *ḱlewes-. The question is only why the vowel in the Slavic word for ‘4’ became lengthened: such a lengthening is conspicuously absent from Baltic *ketur-. Let’s first observe that Baltic has only one relict of the original consonantal stem: the accusative (masculine), Lith. keturì < *keturins < *kʷetur-m̥s. No nominative forms survive in any Baltic language. In Slavic, the same accusative (but with a mysteriously lengthened and acuted vowel) is the source of the nom./acc. (f./n.) *četyri, but in addition we have the nom. (m.) *četyre for expected *četvore < *kʷetwores (the strong form of the stem).

Neither Baltic nor Slavic has any direct reflexes of the old neuter *kʷetwōr. Let’s try to “reconstruct forward” the hypothetical Slavic outcome. Scanty as the evidence is, one would expect the same sonorant loss, circumflex accent and vowel raising as in the types represented by OCS kamy ‘stone’ and mati ‘mother’ (nasal and rhotic stems, respectively):

*kʷetwōr > *ketwō̃ > *ketwū > *ketū > PSl. *čety
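
The forward reconstruction above can be replayed mechanically as an ordered list of string rewrites. This is only a sketch: the rule labels and the split into six steps are assumptions made for illustration (the chain above gives five stages), the names `RULES` and `derive` are hypothetical, and plain substring replacement is adequate only for this one form.

```python
import unicodedata


def nfc(text):
    """Normalise to NFC so precomposed and combining spellings compare equal."""
    return unicodedata.normalize("NFC", text)


# Ordered rewrites: (pattern, replacement, informal label).
# The labels are illustrative assumptions, not formal sound laws.
RULES = [
    ("kʷ", "k", "labiovelar loses its labialisation"),
    ("ōr", "ō\u0303", "final *r lost, vowel becomes circumflex"),
    ("ō\u0303", "ū", "long vowel raised"),
    ("wū", "ū", "*w lost before *ū"),
    ("ke", "če", "*k palatalised before a front vowel"),
    ("ū", "y", "*ū yields Slavic *y"),
]


def derive(form, rules):
    """Apply each rule once, in order, and return every distinct stage."""
    stages = [nfc(form)]
    for pattern, replacement, _label in rules:
        current = stages[-1]
        changed = current.replace(nfc(pattern), nfc(replacement))
        if changed != current:
            stages.append(changed)
    return stages


if __name__ == "__main__":
    print(" > ".join("*" + stage for stage in derive("kʷetwōr", RULES)))
```

Because the rules are applied strictly in order, reordering them changes the outcome, which is the point of writing the derivation as a chain.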

This would have led to excessive variation in the paradigm of ‘4’: *četvor-e/*četur-ī/*čety, had the development been regular. My suggestion is that already in pre-Proto-Slavic the short *u of the weak cases was lengthened on the analogy of the neuter form; eventually the nom.pl. masc. adopted the same vocalism. A similar substitution took place in a few parallel cases (I’ll keep the examples to myself for the time being – it’s something I intend to publish). This lengthened medial *ū acquired an acute intonation as predicted by Miguel. By Late Proto-Slavic times this vowel was generalised in all the forms of the cardinal number, and the old neuter form had been replaced by *četyri.

Piotr Gąsiorowski, 10 October 2014 at 18:59
P.S. I forgot the reference:

Carrasquer Vidal, Miguel. 2013. "Balto-Slavic long vowels". Baltistica 48 (2): 205-217.

Piotr Gąsiorowski, 10 October 2014 at 20:08
Douglas: De Saussure's Effect would have deleted *h1 from /o/-grade, *kWeth1-wor- > syllabic *kWet-h1wor- > *kWet-wor-.

The *o appearing in posttonic suffixes has a different origin from apophonic *o in roots, and may not behave in quite the same way with respect to the Saussure effect. Anyway, as shown by Andrew Byrd, the Saussure effect operates only in sonorant environments. Word-medially, it happens between a liquid and a nasal: H > ∅/oR_N.

Douglas: Original heteroclitic morphology of 'four' is very unlikely due to the complete absence of /n/-forms. No such absence is found with 'fire' and 'water' even though no non-Anatolian language preserves the original declension.

But 'fire' and 'water' are part of basic vocabulary. They are used orders of magnitude more frequently than words with meanings like 'pairing', and are much more resistant to lexical replacement. Despite that, the n-forms of 'fire' survived only marginally, and only thanks to the fact that the original locative in *-wen(i) was not completely abandoned (the best witness is Germanic). Note that even Greek, which has retained many heteroclitic stems, has only -r- in 'fire'.

And in the case of '4' we are not really talking about the heteroclitic noun *kʷét-wr̥, but about its collective which became a numeral, severing all ties with its etymological source. As a numeral, it never had anything else but *-r.

Unknown, 16 October 2014 at 13:56
I fail to see how Miguel's paper justifies getting *c^etýre by substitution of the final vowel from *c^ety, which would be circumflex (*ketû < *ketwô < *kWetwo:r). Miguel's explanation of 'salt' and 'cow' has circumflexes regularly arising from the root-final resonants, and these were maintained when the root-nouns were transferred to the /i/-stems.

Miguel rightly rejects long grade in *bé´rme, 'burden' and must follow Derksen in positing a set.-extension to *bHer-. This is not ad hoc as shown by Lat. _praefericulum_ and Grk. _phéretron_ (from *bHerh1-). The root *k^leu- also has set.-behavior in Grk. _klûthi_ 'hear thou!' and Gmc. *xlu:da- 'easily heard, loud' (which shows that Dybo's Law did not apply before obstruents). Both Skt. _bhr.-_ and _s'ru-_ are among the eight verbs which must omit the connecting-vowel -i- with consonant-initial terminations in the perfect. It seems that *-h1, whatever it meant ('distinctly, separately' vel sim.?) was a fairly rare root-extension, but pre-Vedic Skt. still had to distinguish between the reflexes of *bHer- and *bHerh1-, *k^leu- and *k^leuh1-, etc. With the eight verbs, it was necessary to keep the set.- and anit.-forms separate until the former became obsolete as independent verbs.

Skt. _pa:vakah._ 'fire' (Upan. etc.) probably reflects PIE *pah2wn.-kó- whose form appears in Vedic _pa:vaká-_ 'clean, pure, bright', evidently confused with *pava:ká- 'id.' (indicated by metrics, KEWA 2:264) from _puná:ti_, _pávate_ 'cleans, purifies'. The WGmc 'funk' word meaning 'tinder, kindling, spark' etc. (OHG _funcho_, MLG _funke_, Eng. dial. _funk_) appears to be a weak noun based on Gmc. *funkka- < PIE *ph2un-k-nó-, with Upper Ger. dial. _Fanke_, _Vanke_ from secondary /o/-grade by analogy.

Andrew Byrd and the others seem to have published no objection to de Saussure's Effect occurring in the second part of a compound word, which is what I consider 'four' to be. Given the syllabication *kWet-h1wor-, deletion of *h1 is parallel to that of *h3 in one of the original examples, Grk. _loigós_ 'ruin, plague, reduction of many to few' against _olígos_ 'few'.

Unknown, 21 October 2014 at 02:12
I have a question of no real relevance, and I was wondering if you had a quick answer. Not being a professional, I'm not in a good position to answer it myself at the moment.

I was reading about correspondences between Japanese and Chinese and happened to think about PIE soon after reading some minutiae on the origin of the Mandarin and Japanese shift /p/ > /ɸ/ > /h/. I also thought of some labializing effect of the /β/ phoneme in some varieties of Spanish and the /v~u/ in Romani.

I was wondering if the apparent weirdness of PIE's labial system at the time (the distribution of *p, *b and *bh) could be due to one of the labials softening into *h3 (hence its "labializing" qualities) at a pre-PIE stage? Possibly either *p > *ɸ with *b > p or *b > *β.

I think a good dismissal would be serious and especially chaotic violations of phonotactics. I unfortunately don't have a good dictionary with me, and the ones I normally look at are down, at least on my end.

Octavià Alexandre, 22 October 2014 at 10:24
Oops! I meant the IE word for 'horse' is a Wanderwort shared with Caucasian and Sumerian (where it means 'donkey') and ultimately derived from a "Nostratic" lexeme which originally designated wild ungulates. Sorry for the off-topic.

On the other hand, Latin catēna 'chain' is an Etruscan loanword which in turn would have originated in a satem reflex of IE *kʷet-. It looks like the ancestor of Etruscan was in contact with IE-satem languages, and more specifically, Baltic.

Unknown, 26 June 2015 at 22:43
Very glad I read this. I think you're right.