Where do words come from? We may form them by combining preexisting words or roots with derivational morphemes, but where do those preexisting components come from?
The fundamental grammatical units (morphemes and words) are sometimes created from scratch. Imitative and sound-symbolic words (meow, cuckoo, swoosh, bang) are obvious examples. Of course, as soon as they become fully lexicalised, they may develop in parallel with other words and undergo the same sound changes, which may result in the loss of their iconic character. For example, Latin had the imitative verb pīpiō ‘pip, cheep (like a chick)’. The fact that we also have Ancient Greek pippízō or, for that matter, Finnish piipittää, all with the same meaning, tells us more about young birds (they made noises like [piː piː] in the past just as they do today) than about the languages that have “borrowed” the bird call in question. Such imitative word-coining must have happened independently many times. But Latin also had several words derived from the verb, among them post-Classical pīpiō (acc. pīpiōnem) ‘nestling, young bird’. That derived noun gradually developed a more specialised meaning, ‘squab, young pigeon (or its meat)’, and its form evolved according to regular sound changes. In the passage from colloquial Latin to French, the medial -pi- developed into *-βj- > *-βʤ- > -ʤ-, eventually yielding Old French pijon (~ pigon, pichon, pigeon) /piʤõn/, which was borrowed into fourteenth-century English (and spelt variously peion, pyion, pichon, pygeon, etc.). Modern French /piʒɔ̃/ and Modern English /'pɪʤən/ are not onomatopoeic any more. They have become “normal” words, for which the sound-meaning pairing looks entirely arbitrary. After all, they normally refer to adult pigeons today, and everybody knows that adult pigeons go “coo, coo” rather than “pee, pee”.
It is imaginable that many lexical roots began their existence in a similar manner, as iconic sound combinations mangled beyond recognition by the operation of historical sound change; note, however, that “imaginable” does not mean “demonstrable”. Most of the morphemes whose genealogy we can follow back to a remote past do not seem to have been onomatopoeic during their recoverable history. What is their ultimate source? We simply don’t know. If we say that an English word or lexical root is “of Proto-Indo-European origin”, we mean that at the present state of our knowledge we can trace it back to PIE and no further. If we say that it is “of Latin origin”, we point to Latin as the donor of a loanword and usually stop there; but of course the Latin source itself may have a deeper history in Italic, or it may be a Classical Latin derivative of something older, or it may be a loanword from a still different source (Greek, Etruscan, Gaulish, etc.), in which case the whole cycle repeats itself. Etymological dictionaries declare some words to be “of unknown origin” or “without etymology”, which means that at some point we run out of information as to their deeper origin. But to be honest, we always have to stop somewhere, even if “somewhere” refers to the oldest available reconstructed stage. The genealogies of words (or their parts) lead us deeper and deeper into the past, often forcing us to follow a complex trajectory of borrowing from language to language, or even between language families. In the end we either reach a point beyond which reconstruction is impossible, or lose track of the word in the uncharted wilderness of poorly documented languages. With relatively few exceptions (words known to be coined ab novo), the lexical items we study, or at least their morphological components, are indefinitely old.
Is there a way to refine our methods so that tracing words back to their source would not depend on our ability to reconstruct “protolanguages”? Words and morphemes are in principle immortal. Thanks to horizontal transfer, they can survive the death of the languages they came from. For example, quite a few English words (ebony and ivory among them) are certainly or possibly of Ancient Egyptian origin in the sense that we can follow their trail back to Ancient Egyptian via intermediaries such as French (or Spanish, or Arabic), Classical Latin, Ancient Greek, or Old Persian. Ancient Egyptian is as far as we go simply for lack of further clues. Perhaps there is a way to recover at least a handful of “global etymologies”, i.e. words whose reflexes occur in the language families of all the inhabited continents – not necessarily by virtue of being inherited from a common protolanguage but because they have managed to colonise the world thanks to a combination of inheritance and prehistoric borrowing?
[Image: “I don’t think we have been introduced.” Albrecht Dürer. Source: Wikimedia Commons]
To use a biological analogy, we know that the mitochondrial DNA of our species has a single source – an anonymous Paleolithic woman we dub “Mitochondrial Eve”. There was also a common male source of our Y chromosomes, a certain “Y-Chromosomal Adam”, who lived at a different time and place from Mitochondrial Eve. But that is not the end of the story: in the human genome there are a few hundred thousand regions for which family trees could be constructed, and each of them may be rooted in a different prehistoric individual, some “Adam”, “Eve”, “Jack”, “Jill” or “Algernon” who never met any of the others. Those trees, viewed separately, do not represent the ancestry of the whole species, or of identifiable subpopulations – just of selected genomic loci. Perhaps we can get round the “unreconstructible Proto-World” problem by concentrating on globally shared lexical units and leaving aside the question of which particular languages they have passed through?
The discussion will be continued in the following posts.
[► Back to the beginning of the Proto-World thread]