Language Evolution: Mismatch in the Family

The fact that a language can be regarded as a bundle of coevolving replicators has important consequences for the family-tree model of language evolution. The family tree of a group of languages is the sum of the genealogies of those replicators that have formed coherent bundles (or “have been in the family”) for a sufficiently long time. Any branchings in the family tree tend to correspond to branching-points in the histories of the associated replicators. But this is only a statistical effect. Individual replicators may develop competing variants in the same speech community or invade different languages across communication barriers.

Let’s imagine a situation (illustrated with the diagram below) in which a speech community undergoes internal differentiation into more than two languages. If the process were abrupt, we would expect all such splits to be binary. But a speech community is normally a network of numerous local or social sub-communities. Their historical individuation as separate languages takes time and proceeds gradually; innovations spread easily between mutually intelligible dialects. We have prolonged transitional periods between “a dialect network” (say, Vulgar Latin) and “a group of languages” (say, French, Spanish, Italian, Romanian, etc.), rather than a clean separation point. Little wonder that if replicators produce variants during the dialectal period, those variants do not have to undergo neat resolution, all in the same way, in the emerging languages. Quite the opposite, a good deal of mismatch can be expected.

In our example, replicator A splits into two variants: A1 and A2, and replicator B splits into B1 and B2, all of them coexisting in the same language, Proto-XYZ, which then splits into X, Y, and Z. Let’s suppose that in each of the daughter languages only one variant of A or B survives and the other is lost. The variants may well end up segregated like this:

Conflicting testimony

X: A1, B1
Y: A1, B2
Z: A2, B2

The distribution of {A1, A2} suggests that X and Y share a common innovation (A1) and so are more closely related to each other than either of them is to Z. But the distribution of {B1, B2} shows a common innovation (B2) suggesting a closer relationship between Y and Z (to the exclusion of X). The mutual contradiction is only apparent, though. Not every common innovation of a cluster of languages arose after the separation of their most recent common ancestor from its relatives, and so none of them individually tells us much about the subgrouping of {X, Y, Z}. If nearly all replicators behave like A and only some are like B, we shall prefer {{X, Y}, Z}, but if the evidence is less robust, we may be unable to decide between {{X, Y}, Z} and {X, {Y, Z}} (or perhaps {Y, {X, Z}}, if we take more data into account).
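The conflicting testimony can be made concrete with a small Python sketch. It is only an illustration, not a real subgrouping method: it assumes that any variant shared by exactly two of the three languages counts as a candidate common innovation, and the function names `supported_pair` and `tally_support` are invented for this example.

```python
from collections import Counter

# Surviving variant of each replicator in each daughter language,
# copied from the X/Y/Z example above.
languages = {
    "X": {"A": "A1", "B": "B1"},
    "Y": {"A": "A1", "B": "B2"},
    "Z": {"A": "A2", "B": "B2"},
}


def supported_pair(replicator, languages):
    """Return the pair of languages sharing a variant of `replicator`.

    Assumption (for illustration only): a variant shared by exactly two
    languages is treated as a possible common innovation. Returns None
    if the replicator does not single out exactly one pair.
    """
    by_variant = {}
    for lang, variants in languages.items():
        by_variant.setdefault(variants[replicator], set()).add(lang)
    pairs = [frozenset(group) for group in by_variant.values() if len(group) == 2]
    return pairs[0] if len(pairs) == 1 else None


def tally_support(languages):
    """Count how many replicators speak for each two-language subgroup."""
    replicators = next(iter(languages.values())).keys()
    votes = Counter()
    for replicator in replicators:
        pair = supported_pair(replicator, languages)
        if pair is not None:
            votes[pair] += 1
    return votes


if __name__ == "__main__":
    for pair, count in tally_support(languages).items():
        print(sorted(pair), count)
```

With the data above, {X, Y} and {Y, Z} receive one vote each, which is exactly the stalemate described in the text. Only a larger set of replicators, most of them patterning like A, would tip the balance towards {{X, Y}, Z}.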

As a real-world example, consider three Slavic languages: Polish, Czech and Slovene. According to handbook classifications, Polish and Czech are members of the West Slavic grouping, characterised by a cluster of shared innovations, for example the regular development of Proto-Slavic *tj, *dj into consonants traditionally transcribed *c, *ʒ (≈ IPA [ʦ, ʣ], a pronunciation preserved in Polish, where the spelling is c, dz; in Czech, *ʒ ends up as /z/, but *c remains /ʦ/). Slovene, a South Slavic language, shows a different phonetic development. The clusters in question are reflected as Slovene č and j (presumably via the palatal stops *tʲ, *dʲ ≈ IPA [c, ɟ]):

Proto-Slavic	Polish	Czech	Slovene
*světja ‘candle’	świeca	svíce	svẹča
*medja ‘boundary’	miedza	meze (< *meʒa)	meja

There are, however, other changes that tell a different story. The Proto-Slavic non-initial sequences *or, *er, when followed by a consonant, developed in different ways in different parts of Slavic. In Polish and in some other West Slavic languages the vowel and the consonant simply swapped places, yielding *ro and *re, respectively. But in the Slavic dialects ancestral to Czech and Slovak they followed the South Slavic pattern: the outcome was *ra, *rě, with vowels that can be regarded, in Proto-Slavic terms, as the tense counterparts of lax *o and *e:

Proto-Slavic	Polish	Czech	Slovene
*morkъ ‘twilight’	mrok	mrak	mrak
*berza ‘birch’	brzoza (< *breza)	bříza (< *brěza)	brẹza (< *brěza)

It would be fair to say that the Czech-Slovak group is on the whole “West Slavic” but “South Slavic” in several respects, including the treatment of vowel + *r sequences. As could be expected, there are other complications as well. For example, the groups *tj, *dj do not yield the same outcome everywhere in South Slavic. In Serbo-Croatian we find ć, đ (= IPA [ʨ, ʥ]), which can plausibly be derived from the same source as the Slovene variants, but in Bulgarian (as well as Old Church Slavonic) the development is highly idiosyncratic: št, žd. It is unlikely that the Slovene/Serbo-Croatian type and the Bulgarian one represent a single “Proto-South Slavic” innovation (in fact, the former seems more akin to the West Slavic development). The linguistic diversity of the South Slavic dialectal network must have been considerable even before it started to break up into separate languages, and some of the pre-split variation still persists. The same is true of West Slavic, which is hardly uniform and whose separation from South Slavic on the one hand and East Slavic on the other is not entirely consistent: the family trees of individual replicators often fail to match each other. In such cases it is difficult to represent the historical relationships among the members of the linguistic grouping as a neat phylogeny with clearly distinct branches.

6 comments:

ml2013, 16 April 2013 at 00:49
Hi Piotr,

Out of curiosity, do you know of any asymmetries like the above in the case of lower-level Slavic branchings such as Eastern/Western South Slavic?

Regards
Piotr Gąsiorowski, 16 April 2013 at 08:51
I'm sure there are some, although it may be disputable which of them go back to "Late Common Slavic", and which result from more recent diffusion in the Balkan Sprachbund. For example, Macedonian clusters with Bulgarian in most respects, but there are also features connecting it with Serbian. The treatment of *tj, *dj (Macedonian /c, ɟ/ rather than /ʃt, ʒd/) is of the "Western South" type, though it has been argued that it reflects relatively recent South Serbian influence.
Hans, 17 April 2013 at 16:24
My impression is that "South Slavic" is just lumping together all extant Slavic languages that are neither West nor East Slavic. I wouldn't be able to name any common South Slavic feature that wouldn't be shared by some East or West Slavic languages as well.
Piotr Gąsiorowski, 17 April 2013 at 17:14
I agree. It's an areal grouping of mixed origin, not reducible to a "Proto-South Slavic" language. There was no single migration of Slavic speakers into the Balkan region. I didn't say so explicitly above, but Proto-XYZ in my diagram should actually be identified with Proto-/Common-Slavic (with only three selected descendants shown in the picture as a matter of didactic simplification).
Rob, 22 August 2016 at 01:14
Hi Piotr,
Would you opine that East Slavic is the most homogenous 'grouping'?


14 April 2013
