What is a “reconstructed
protolanguage” like Proto-Indo-European? It’s customary to define it as the
most recent common ancestor of a family of languages. If the structure of
relationships within the family is
represented as a clearcut phylogenetic tree, it is even possible to offer a formal
definition based on a pair of languages, A and B, each belonging to a different
primary branch of the family (produced by the oldest split in its history). Thus,
Indo-Europeanists generally agree that the Anatolian group (Hittite, Luwian and
their close relatives) had split from the “core” part of the family before Core IE
underwent further fragmentation. By putting it this way we ascribe a “basal”
status to Anatolian and tend to see the other branch as “PIE proper”. That’s
because Core IE contains all the modern languages and all the familiar (and
excellently documented) “classical” ones like Latin, Ancient Greek, and Vedic. We
should realise, however, that the privileged position of Core IE with respect
to Anatolian is merely an artifact of the history of IE studies and of the (accidentally)
unequal written attestation of the two primary branches. It is more reasonable
to say that the common ancestor bifurcated into a pair of daughter languages,
Proto-Anatolian and Proto-Core (rather than to insist that either of them
“split off first”). The fact that Proto-Anatolian has left no contemporary
descendants is an accidental consequence of the vagaries of history, not of its
“basal” or “less advanced” status.
[Image: Getting blurry. Hat tip: Jo Verrent]
[Figure: Deep coalescence]
- nom./acc.sg. *dóru, gen.sg. *déru-s
However, no
IE language preserves anything remotely like this. On strictly comparative
grounds, without trying to make sense of the paradigm and its relation to other similarly behaving nouns, one could reconstruct something like *dóru/*dorw-ós (Indo-Iranian
disagrees, suggesting *dóru/*dréu-s instead), and leave it at that. However, we have reasons to believe that
such forms are analogical, reflecting various attempts to level out vowel
alternations or replace the unproductive acrostatic pattern with a more common
mobile one. If the reconstruction *dóru/*déru-s is correct, it belongs to a deep
chronological stratum in the prehistory of PIE itself. It was in all likelihood
replaced by “regularised” variants like *dóru/*dérw-os (with a vowel inserted in the genitive suffix), *dóru/*dorw-ós (with the root alternation eliminated and the genitive ending accented, as in more productive mobile patterns), and
possibly others, before the disintegration of PIE unity. Quite possibly by
that time the “original” paradigm had already been abandoned and forgotten. The
large amount of hesitation and paradigmatic inconsistency found even in the most conservative languages on either side of the Anatolian/Core divide suggests that
much of that polymorphy was inherited from the common ancestor. Fully coalescent
forms must therefore be older and can’t be dated with much accuracy. Reconstructed PIE is only
roughly bounded in time (and even less so geographically, since even the
approximate location of the IE homeland remains a moot question). The more we
rely on abstract analyses to recover “the oldest” forms of alternating
paradigms and to understand the origin of the alternations, the less precisely their
chronology can be determined.
Note that
we are talking of a well-studied language family with 400+ extant members and a
written record beginning in the second millennium BC. The PIE reconstruction is
a monumental intellectual achievement, and yet it isn’t “a language” that could
be ascribed to any single speech community at any time. It’s a large set of
coalescent reconstructions distributed in time and possibly in space as well. Other
protolanguages, even relatively uncontroversial ones, are usually still more
nebulous. If we ever manage to prove that the IE languages are related to some
other established family, the reconstructed features of the common ancestor
will naturally be even harder to constrain, and the protolanguage itself more
elusive and fragmentary. It is hard to predict how far back in time
our best reconstructive methods can take us before the notion of
“protolanguage” becomes too vague to be meaningful. We can only resolve this
question empirically, by putting our methods to extreme tests. If we
consistently fail, it may mean that we have already reached the limit. Fortunately,
there is no shortage of enthusiasts undaunted by the difficulties of long-range
comparative research. Their efforts are necessary and praiseworthy, but the
results so far have been rather disappointing. Only time can tell if further
progress can be achieved. But even if some “superfamily” groupings eventually
win general acceptance, we are still light-years away from reconstructing anything
resembling Proto-World. My personal view, which I have tried to justify in this
series of posts, is that linguistic lineages are too ill-defined to remain identifiable at great depths of time.
Language phylogenies are not real objects but products of our analysis, with
all its limitations and simplifying assumptions. They are defined from a
certain point of view, relative to an arbitrarily chosen historical frame of
reference. If you try to extend them far beyond that frame, their boundaries,
fuzzy to begin with, just blur away. There is therefore no way to define
Proto-World properly as “a language”, let alone reconstruct its features.
But perhaps we can talk meaningfully of global etymologies without insisting on the reconstruction of Proto-World? This will be the topic of the next post.
[► Back to the beginning of the Proto-World thread]
I think that we cannot define protolanguages as equivalent to most recent common ancestors. The most recent common ancestor of the Indo-European languages undoubtedly existed, and we know some things about it. But to suppose that it looked exactly like the product of reconstruction is not reasonable. In particular, reconstruction smooths away all irregularities, but the principle of uniformity in time (that our time is not special) says that the actual MRCA was just as irregular as its descendants.
I agree. It's the coalescent nature of reconstructions that makes them look uniform, but the actual points of coalescence for different lexical and grammatical features are unlikely to coincide in time with each other, let alone the MRCA. My whole point is that the two things (the MRCA and the product of the comparative/internal reconstruction) should not be confused with each other.
the (accidentally) unequal written attestation of the two primary branches
I suspect that this sort of inequality is actually the most likely outcome. If I get a chance I'll write a little simulator that generates random binary trees such that each leaf node either goes extinct or generates two descendants with fixed probability. I suspect that most trees will be extremely unbalanced, with most of the descendants from one branch and only a few from the other at every level. Unfortunately I do not know a metric for unbalancedness offhand.
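The simulator proposed in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the comment does not say when the process stops, so the sketch runs for a fixed number of generations; the function names, the default splitting probability of 0.55 and the "share of survivors in the larger primary branch" score are all hypothetical choices made here, not anything the commenter specified. (Phylogenetics does have standard imbalance measures, such as the Colless index, which sums the left/right differences over all internal nodes; the score below looks at the root split only.)

```python
import random


def grow_branch(p_split, generations, rng):
    """Count the living leaves descended from one lineage.

    In every generation each living leaf either splits in two
    (probability p_split) or goes extinct (probability 1 - p_split).
    """
    alive = 1
    for _ in range(generations):
        splits = sum(1 for _ in range(alive) if rng.random() < p_split)
        alive = 2 * splits
        if alive == 0:
            break  # the whole branch has died out
    return alive


def root_imbalance(left, right):
    """Share of surviving leaves in the larger primary branch.

    0.5 means perfectly balanced, 1.0 means one primary branch left no
    survivors. Returns None if the whole tree is extinct.
    """
    total = left + right
    if total == 0:
        return None
    return max(left, right) / total


def simulate(trials=2000, p_split=0.55, generations=20, seed=1):
    """Grow many two-branch trees and collect their imbalance scores."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    scores = []
    for _ in range(trials):
        left = grow_branch(p_split, generations, rng)
        right = grow_branch(p_split, generations, rng)
        score = root_imbalance(left, right)
        if score is not None:  # skip trees with no survivors at all
            scores.append(score)
    return scores


if __name__ == "__main__":
    scores = simulate()
    lopsided = sum(1 for s in scores if s >= 0.9) / len(scores)
    print(f"surviving trees: {len(scores)}")
    print(f"mean share in larger branch: {sum(scores) / len(scores):.2f}")
    print(f"trees with >= 90% of leaves in one branch: {lopsided:.0%}")
```

Varying `p_split` and `generations` shows how sensitive the degree of lopsidedness is to the assumptions; trees in which one primary branch dies out entirely score 1.0, which is the situation of a family with a single surviving "favourite daughter".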
It's what biological family trees usually look like too. There are exceptions -- e.g. the earliest amniotes split almost immediately into the "mammalian" and the "reptilian" branches, which remain more or less equally diverse after 300 million years (well, actually there are many more modern species on the reptile/bird side, but we mammals like to think that we have been especially successful and that we are living in the Age of Mammals, so there's no way we could be basal amniotes). But in most cases we end up with a large "favourite daughter" plus its less lucky "basal" sisters.
I like that you draw attention to internal reconstruction's temporal ambiguity, but I'm not sure it's justified to then extend that to all reconstruction. The comparative method, more strictly applied, will provide features of the MRCA pretty much by definition. Or at least, it can't get at older features - parallel development and convergence are problems, but those are things that often leave more discoverable traces if you pay good attention to relative chronology (palatalization in Old English and Old Frisian is a good example of a development that clearly postdates separate innovations in each daughter, and so can't predate any 'Proto-Anglo-Frisian').
This hardly removes all chronological uncertainties, since the comparative method and internal reconstruction are often both applied hand-in-hand, but I think it's not good to gloss over just how much of the MRCA comparison really can recover. And it's also only internal reconstruction that will smooth out irregularities: direct comparison of two irregular paradigms should result in the reconstruction of an irregular paradigm (maybe an acceptable example would be the reconstruction of a paradigm like *wurkjan 'to work', past *wurhtē in Proto-Germanic, where this and a few other verbs anomalously don't have a stem vowel *-i- before the preterite ending, a situation preserved directly in most of the older Germanic languages).
The biggest problem with something like Proto-World seems to be data rather than method. If irregular features get levelled out in different ways, the only method that can be applied is the less constrained technique of internal reconstruction. This is already the case to a fairly large extent in PIE; as you say, using this itself as a basis for reconstruction gets sketchy fast.
The comparative method, more strictly applied, will provide features of the MRCA pretty much by definition. Or at least, it can't get at older features...
It provides features of the MRCA of the forms being compared. But the question arises whether the different MRCAs recovered in this way really belong to "the same language" (i.e., if they really coexisted in the same ancestral speech community). The fact that languages are non-uniform populations subdivided into varieties militates against such a view of the reconstructed protolanguage. For example, let's consider Anglo-Frisian brightening (*a > *æ). The change took place in "Proto-Anglo-Frisian", which makes it pre-English, but dialectal contrasts such as Anglian cald, calf : West Saxon ċeald, ċealf are according to some authors (Richard Hogg, for example) more parsimoniously analysed as reflecting different brightening-blocking contexts in different parts of the Proto-Anglo-Frisian dialect network. The "most recent word ancestors" *kald-, *kalβ(-r)- are in this case older than the "most recent language ancestor".
P.S. I should have added that Frisian has ald, kald, half, salt, etc., like Anglian but unlike WS.
Yes, dialect variation can add some uncertainty, just like parallel innovation. But I think it's fair to say that it's a kind of difficulty that the comparative method is fairly sensitive to. In the case of OE and Frisian, the data speaks directly against reconstructing a unified 'Anglo-Frisian', to which processes like palatalization had applied in a shared, single history (Patrick Stiles makes this argument in his 1995 paper). You say that *kald- and *kalv(r) are older than the most recent common ancestor of English and Frisian - but I'm not sure that's really the proper way to look at it. At least in this case, we're looking at innovations, possibly parallel and possibly genuinely connected, spreading with observable differences across meaningfully differentiated dialects - and the data tells us this is what happened. In other cases, such as with the Ingvaeonic 'Nasalspirantengesetz', an innovation seems to spread with less variation across a set of grammars (in individuals' heads ultimately, of course) without any significant observable differentiation between them. Presumably this is because it was affecting a smaller and less differentiated speech community, rather than the sprawling dialect continuum that later Ingvaeonic and then Old English would become (or, it might be better to say, that the later Ingvaeonic dialect continuum probably grew out of a smaller slice of an earlier dialect continuum).
It seems like there is a real difference between those kinds of innovations. The success of the comparative method without too many problems in a remarkable number of cases strongly suggests that the latter situation is not terribly uncommon; the many problems it runs into indicate that the former one is also pretty common. But in cases where there _is_ a 'MRCA' (meaning small and only very lightly differentiated dialects at the most varied), the comparative method ought to get at it. Careful attention to relative chronology and the like is pretty important to making sure the comparative method works properly, but that's hardly a new idea.
But in cases where there _is_ a 'MRCA' (meaning small and only very lightly differentiated dialects at the most varied), the comparative method ought to get at it.
Well, if. When dealing with related languages we can always apply the comparative method, no matter what the MRCA looked like (and how homogeneous it was). How do we reconstruct things like the Proto-Indo-European genitive of the 'tree' word? Hittite has nom.sg. tāru, gen. GIŠ-ruwas, most likely reflecting *dor-u, *dorw-os (but the vowel of the oblique stem is concealed by the Sumerogram); Sanskrit has dā́ru, drós, as if from *dór-u, dr-éu-s (cf. Av. dāuru, draoš); Greek has dóru, dóratos/doúratos ~ dourós. Is there any sensible way in which we can compare such paradigms directly, without resorting to internal reconstruction? And if you look at nouns like 'sun' or 'fire', there are immense complications even within Germanic, not to mention PIE.
I fully agree that in cases like the PIE accentual paradigms, direct comparison simply doesn't work (not uncommon for morphology) and internal reconstruction is necessary. And that adds more doubts and chronological uncertainties to the picture. It's basically a matter of quality of data: better data allows for the application of better methods and at least somewhat more secure reconstruction of actual 'languages'. That, I think, is at least part of the point you've been making, that for Proto-World our data is not good at all, and the methodology for reconstructing it is correspondingly tenuous.
Great post, Piot
I wonder if *dérus, accessible only by internal reconstruction, was uniformly replaced by *dorwós on the way to PIE, and this in turn was replaced by *dreus on the way to Proto-Indo-Iranian. Does this make sense based on the many things you know and I don't?
*do/eru- is not an isolated example but part of a pattern. We have several other u-stem neuters of this type: *h₂o/eju- 'vitality', *ǵo/enu- 'knee', *so/enu-, and probably *ḱo/eru- 'horn' (though the 'horn/head' lexical superfamily is so complicated that Alan Nussbaum had to devote a full-length monograph to it). There's evidence of proterokinetic weak cases for all of them except the last in Indo-Iranian (with *-o- in the strong cases, as shown by its Brugmannian lengthening). But the allomorph with zero-grade in the root and e-grade in the suffix turns up in other branches as well (suffice it to mention English tree and knee). It looks as if alternative case-forms had existed already in PIE and had not been fully segregated until much later in the history of the daughter languages.
A missing gloss: *so/enu- 'back, spine', as in Ved. sā́nu, gen. snóḥ.
I never thought of the vocal aspect to human communication as a language, per se. At least not in an intellectual sense. That is, in the sense of visual thinking and human intelligence. You know, when you want to flesh something out to the n’th degree, you illustrate it. When you want to communicate in another culture, you draw a picture. The vocal aspect of language certainly holds an emotional component to our visual imagination and those ideas, because our emotions are closely linked to our aural perception. However, vision is as cold as ice, because it’s about an array of possible imagined futures down range as far as the eye can see. I mean, why should we get emotionally worked up about something that may or may not happen?
In other words, our intellect is void of all emotions, and thus I find the vocal aspect to our communications imbues emotional meaning in any given context, in the same way our dreams imbue emotional context to our fearless imagination. However, I find our visual imagination is where language finds its roots. Thus, my question is: what is the importance of studying the historical trajectory of emotional outbursts? The thousands of vocal dialects seem incidental, when there is but one universal visual intelligence and its language.
our intellect is void of all emotions
I doubt that such a separation exists. After all, all motivation is emotion. A Vulcan who has undergone the full kolinahr has no reason to do anything, and that includes obeying the "dictates" of "logic".
there is but one universal visual intelligence and its language
Not so. Not only do different cultures give different connotations to visual symbols, including colors; but the languages you speak even influence which colors you see as very similar shades of the same thing vs. as completely different. See a drastic example here.