Question 2: Are all the recorded languages ultimately related?
We normally say that two languages are related if they go back to a common ancestor. But we have already seen that “common ancestry” is a tricky notion in the case of entities with easily permeable fuzzy boundaries. To say that, for example, Latin and Sanskrit had a common ancestor is shorthand for saying that the most conservative core of their lexicons consists of linguistic replicators whose genealogies can be traced back to “the same” speech community, delimited in space and time. The core is quite thick, to be sure – several hundred lexical items with cognates in the other language. They show evidence of having undergone the characteristic sound changes that have affected each lineage during its separate history, they display similar inflectional patterns, etc. They are homological structures, not mere lookalikes. But the fact remains that both Latin and Sanskrit also contain thousands of lexical units whose history is more complicated. Some have been borrowed from one branch of Indo-European into another, others come from outside the family. For example, Latin has loanwords borrowed from Etruscan, a non-Indo-European language of ancient Italy (in addition to inner-IE loans from Greek, Gaulish, the extinct Italic languages, etc.); Sanskrit in turn has numerous words apparently imported from extinct and otherwise lost ancient languages – the enigmatic linguistic substrates of Central Asia and the Indian subcontinent.
The “core” vocabulary tends to erode away with the passage of time. The branches of the family trees we reconstruct do not represent complete languages but only their most durable cores, which get thinner and thinner as we run our reconstruction back in time. Proto-Indo-European is still a solid construct because Indo-European is a vast family with some excellently documented representatives providing first-class historical evidence, but many language families are defined on a much shakier basis. Uralic is not bad at all, but its lexical core shared between the primary branches of the family amounts to something like 200 reconstructible roots. Afroasiatic, with just a few dozen uncontroversial proto-morphemes, is already a borderline case. “Relatedness” understood as shared ancestry makes sense as long as we can support it with a large number of words and morphemes showing systematic phonological correspondences. If all we can parade as evidence is a handful of imperfectly matching lexical roots and some similar-looking inflectional endings, “relatedness” evaporates and cognacy becomes indistinguishable from accidental similarity.
It is quite possible that high-frequency units for which we can predict the lowest rate of lexical replacement and the longest survival time – for example personal pronouns – may be retained via vertical inheritance long enough to suggest remote relationship between otherwise distinct language families. This might be the source of some curious cross-family correspondences like the M-T phenomenon in several language families of northern Eurasia (where the nasal /m/ tends to occur in first-person pronouns and a coronal obstruent in second-person pronouns). But such evidence, no matter how tantalising, is hardly sufficient to demonstrate a “superfamily” relationship if not backed up by a substantial amount of data to which the comparative method could be applied to rule out chance agreement.
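To give a rough feel for what "chance agreement" means for a single shared trait like the M-T pattern, here is a back-of-the-envelope sketch. Every number in it is a hypothetical assumption made for illustration: the inventory size, the count of coronal obstruents, the number of families, and above all the idea that pronoun consonants are drawn uniformly and independently. None of these come from real data.

```python
from math import comb


def p_pattern_one_family(n_consonants: int, n_coronal_obstruents: int) -> float:
    """Chance that one family shows /m/ in the 1st person and a coronal
    obstruent in the 2nd, assuming (unrealistically) that each pronoun
    consonant is drawn uniformly at random from the inventory."""
    p_first_is_m = 1 / n_consonants
    p_second_is_coronal = n_coronal_obstruents / n_consonants
    return p_first_is_m * p_second_is_coronal


def p_at_least(k: int, n_families: int, p: float) -> float:
    """Binomial tail: probability that at least k of n_families independent
    families show the pattern purely by chance."""
    return sum(
        comb(n_families, i) * p**i * (1 - p) ** (n_families - i)
        for i in range(k, n_families + 1)
    )


# Hypothetical figures, chosen only to make the arithmetic concrete.
p = p_pattern_one_family(n_consonants=20, n_coronal_obstruents=6)
print(f"chance for a single family: {p:.4f}")
for k in (1, 2, 3, 4):
    print(f"at least {k} of 100 families: {p_at_least(k, 100, p):.4f}")
```

The uniform model is certainly too crude: pronouns favour a small set of consonants, and neighbouring families are not independent samples. Whichever way a calculation of this kind comes out, it only concerns one isolated trait, which is why it cannot stand in for a body of data to which the comparative method can be applied.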
If the applicability of the family tree model is limited in this way, perhaps we should focus on individual linguistic replicators – the stuff of which languages are made – rather than languages themselves. It could be argued that despite horizontal diffusion the genealogies of related replicators will still converge at some point in the past. Their family trees will not be isomorphic with language phylogenies, but a borrowed morpheme also has its deeper history in the language it came from. Even if the notion of “language relatedness” can’t be extended ad infinitum, it is imaginable that most replicators, whether transmitted vertically between generations of speakers or horizontally between different speech communities, eventually coalesce with their relatives in one and the same ancestral speech community somewhere in the deep prehistory of language. I see no easy way to disprove such a possibility, but I see no way to prove it either.
[Image: The kind of thing I do not trust at all]
Relatedness can be tested for items with reconstructable histories, because we know what regular changes they can be expected to have undergone along the way, and what correspondences they should exhibit. Without that knowledge, anything could be related to anything else. A long sequence of phonological changes can distort a word beyond recognition; semantic shifts can change its meaning. With a little bit of imagination it’s easy to invent an arbitrary scenario relating a word in Basque to a word in Georgian, Hungarian, Sumerian, or any language of one’s choice. It’s a popular sport among amateur long-range comparatists, but it is not the way sound historical linguistics should be practised. It is wiser to admit our ignorance than to use dubious methods to get untestable results. So my well-considered answer to Question 2 is, “I have no clue”. The null hypothesis in such cases is always that A is not related to B unless there is sufficient evidence to conclude otherwise. I apologise if this attitude sounds unromantic.
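The point that loose criteria let "anything be related to anything else" can be illustrated with a small simulation. It is only a sketch under invented assumptions: two word lists of random CVC roots stand in for unrelated languages, the three consonant classes are a crude hypothetical grouping, and "semantic latitude" is modelled simply as allowing a word to be compared with meanings a few slots away in the list.

```python
import random

# Hypothetical, deliberately crude consonant classes (an assumption).
CLASSES = {
    "labial": "pbmfw",
    "coronal": "tdnszlr",
    "dorsal": "kgxh",
}
CONSONANTS = "".join(CLASSES.values())
CLASS_OF = {c: name for name, members in CLASSES.items() for c in members}
VOWELS = "aeiou"


def random_wordlist(n_meanings, rng):
    """One random CVC root per meaning slot: an 'unrelated language'."""
    return [
        rng.choice(CONSONANTS) + rng.choice(VOWELS) + rng.choice(CONSONANTS)
        for _ in range(n_meanings)
    ]


def strict_match(a, b):
    """Both consonants must be identical (vowels are ignored)."""
    return a[0] == b[0] and a[2] == b[2]


def loose_match(a, b):
    """Both consonants need only belong to the same broad class."""
    return CLASS_OF[a[0]] == CLASS_OF[b[0]] and CLASS_OF[a[2]] == CLASS_OF[b[2]]


def count_matches(list_a, list_b, semantic_window, match):
    """Count word pairs that 'match' and whose meaning slots lie within
    semantic_window of each other (0 means the same meaning only)."""
    count = 0
    for i, a in enumerate(list_a):
        for j, b in enumerate(list_b):
            if abs(i - j) <= semantic_window and match(a, b):
                count += 1
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable
lang_a = random_wordlist(200, rng)
lang_b = random_wordlist(200, rng)

strict = count_matches(lang_a, lang_b, 0, strict_match)
loose = count_matches(lang_a, lang_b, 3, loose_match)
print(f"strict criteria: {strict} chance 'cognates'")
print(f"loose criteria:  {loose} chance 'cognates'")
```

Because every pair accepted by the strict criteria is also accepted by the loose ones, the loose count can never be smaller, and in this toy setup nothing but chance produces either number. Regular sound correspondences are what remove that freedom in real comparative work.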
[► Back to the beginning of the Proto-World thread]