Proto Language

If you are ever blind-sided by some surprise blow, your first question, as you sit there on your tuchis, is likely to be, “Where the heck did that come from?”

Scientists are no different.

No blow blind-sided the world of medicine with more impact than the AIDS epidemic.  It seemed to spring up from nowhere with ferocious infectiousness and 100% lethality.

Nowadays, we would modify both of those.  As such things go, HIV is not terribly infectious (Now, there’s a thought!).  Nor is it 100% fatal, although it is still pretty darned close.

What it is, however, is frighteningly mutative.  Within the two main types of the virus, HIV-1 and HIV-2, there are literally dozens of subtypes.  This ever-changing diversity has made prevention, treatment and cure of the disease maddeningly difficult.  Oddly enough, though, it has actually helped answer the question of where the virus came from.

Another way to phrase that question is to ask whether HIV, as such, had come intact from some animal reservoir or whether it had jumped species by changing itself.  A few years ago a genetic comparison between HIV-2 and the form of Simian Immunodeficiency Virus (SIVsm) found in the sooty mangabey monkey established that HIV-2 was derived from that form of SIV but was not identical to it.

In 1999, the same type of comparison established a similar connection between the more virulent form, HIV-1, and the version of the chimpanzee SIV (SIVcpz) found in a specific subgroup of chimpanzees.

This answered the question of where HIV-1 came from (i.e. it had not come intact from some natural reservoir but mutated its way from ape to man), but opened the question of how and when.  The dark side (those wonderful folks who can come up with a conspiracy as the answer for any problem) immediately suggested that the West’s use of chimpanzees in medical experiments had probably created the virus.  They added that the Belgian polio vaccination program in Africa in the 50’s was the mechanism of its spread.

That theory was tied to a definite time and place.  The place was hard to prove, but the timing offered a chance to get evidence for or against it.  Did AIDS actually begin its spread in the late 50’s?  Were there any earlier occurrences?

So far as actual physical evidence goes, the opportunities for proof are slim.  There are just not too many tissue samples extant from before the 80’s.  There was a Norwegian sailor with HIV-positive tissue who died around 1976.  There was a teenager from St. Louis in 1969.  There was one dubious sample from the Congo taken in 1959 that could have been HIV-1.  That’s about it.  Clearly none of them could answer a how-and-when question from the 50’s.

Dead end?  Not quite.

Tada!  The supercomputer to the rescue.

A group of researchers decided to take advantage of the mutability of the virus and the power of the supercomputer to do some regression analyses.  All of those sub-types of HIV, far from making the problem insoluble, actually were essential to its solution.

It goes like this:  Interpolating from all the subtypes, one can get a pretty good estimate of the average time between mutations.  Analyzing near cousins allows one to determine the number of mutations separating subtypes.  From this, we can begin to construct a family tree.  Using the supercomputer (in this case, the one in New Mexico delightfully named Nirvana), we can begin to calculate backwards, filling in the intermediate mutations and their times.  Eventually we reach the common point where all the lineages converge.
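
For the curious, the backward calculation can be sketched in a few lines of Python.  This is only a toy, not the researchers' actual method: it assumes mutations pile up at a steady rate, fits a straight line to mutation counts against sampling year, and reads off the year where the line hits zero.  The sample years and mutation counts are invented, rigged so the line lands on the figure quoted in this essay, and the names fit_line and estimate_origin_year are my own.

```python
# Toy molecular-clock sketch. All numbers are invented for illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_origin_year(sample_years, mutation_counts):
    """Year at which the fitted mutation count falls to zero (the common ancestor)."""
    slope, intercept = fit_line(sample_years, mutation_counts)
    return -intercept / slope


# Hypothetical samples: year collected, and mutations separating each from the root.
years = [1983, 1986, 1990, 1994, 1998]
mutations = [106, 112, 120, 128, 136]

origin = estimate_origin_year(years, mutations)
print(f"Estimated year of the common ancestor: {origin:.0f}")
```

The analysis described above worked over a whole family tree rather than a single straight line, but the idea of running the clock backwards until the lineages meet is the same.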

The result:  The most likely time of emergence of HIV-1 was 1930.  That’s right, 1930, long before most of the chimpanzee medical experiments and long before the polio vaccine.

There is a qualifier.  Regression analyses multiply the initial errors.  The farther the extrapolation, the greater the margin for error.  In this case, the margin of probability extends as far back as nearly 1910 and as far forward as the late 40’s.  The entire exercise shows both the power and the limitation of this kind of analysis.
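
A little arithmetic shows why the margin widens.  The sketch below is my own illustration with invented numbers: it assumes the mutation rate is known only to within ten percent, then asks how wide the window of possible dates becomes for a short reach backwards (20 mutations) and a long one (140 mutations).  The function name date_window and every figure in it are assumptions, not values from the study.

```python
# Toy illustration of error growth under extrapolation. All numbers are invented.

def date_window(sample_year, mutations, rate, rel_err):
    """Return (earliest, latest) origin year if the rate is off by +/- rel_err."""
    slow = rate * (1 - rel_err)  # slower clock: more years needed, earlier date
    fast = rate * (1 + rel_err)  # faster clock: fewer years needed, later date
    return sample_year - mutations / slow, sample_year - mutations / fast


near = date_window(2000, 20, 2.0, 0.10)   # a short step back
far = date_window(2000, 140, 2.0, 0.10)   # a long step back

for label, (earliest, latest) in (("near", near), ("far", far)):
    width = latest - earliest
    print(f"{label}: {earliest:.1f} to {latest:.1f} (window of {width:.1f} years)")
```

The same ten percent doubt about the rate yields a window seven times wider for the reach that is seven times longer.  In this simple model the uncertainty grows in step with the distance extrapolated.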

Which brings me to the Proto World Language.

Say what?

Regression is a powerful tool?  Well, there is another huge set of data that tempts one to regress.  This is the great fund of linguistic cognates.

In the wonderful world of linguistics, cognates are words in different languages that are so similar they (probably) have a common origin, like cold in English and kalt in German.  Trace the words back and you will find them converging to a common past source language.  Of course, in the days before computers, it took endless hours by tireless cognoscenti to work out the derivations.  This sort of word play delighted the nineteenth century linguists.  Hence the entire field of Historical Linguistics.

There are hundreds of examples of cognates in the Indo-European family of languages.  The closest matches are found in the most basic, common words.  Take, for instance, the English word brother.  In Sanskrit, it is bhratar.  In Greek, it’s phrater.  In Latin, it’s frater.  In Gothic, it’s brothar.  Or the English word father.  In Sanskrit, it is pita.  In Greek, it’s pater.  In Latin, it’s pater (again).  In Gothic, it’s fadar.

To the linguists, these cognates suggested the wonderful notion that perhaps all of these similarities are due to all Indo-European languages having derived from some long ago common language.  What an idea!

Unfortunately, to confirm our comparative studies, we need samples.  Not just from the descendants, but from the source as well.  We would love to have pieces of this theoretical common source language, but we don’t.  In fact, we can’t even get too close.  In languages, you don’t have to go too far back before you run out of samples.  There must have been thousands of years when human beings spoke but did not write.  It is enough to make the Historical Linguist’s teeth hurt.  You can just look at the commonality between the words and know that they once derived from a common language, a Proto-Indo-European.  And it is lost forever!

At this point a silver-tongued tempter begins to whisper in the ear.  “No,” it says, “you cannot actually get relics of Proto-Indo-European.  It is too far in the past.  But look at all those computers.  Think about Regression Analysis.  Why, with the right grant and enough money, you can reconstruct Proto-Indo-European.  Think of it!  Your Nobel Prize awaits!”

I shouldn’t make too much fun of this, as a lot of reasonably solid work has been done in the attempt to extrapolate back to the source language of the Indo-European family.  Although the reconstruction is hardly accepted as a whole, many of the derived source words for common items like “water” and “foot” are recognized as usefully accurate.  It is still true that naked extrapolation too far back raises the probable error beyond all reason.  In the nearer term, however, the analysis becomes as much a matter of interpolation as extrapolation.

But the devil is not so easily thwarted.  Soon the tempter is back.  The problem, you see, is that some non-Indo-European words are also temptingly familiar.  For instance, in Arabic, father is ab while in Hebrew it is aba.  Well, they are close enough, no surprise.  But can they be related to our papa?  In Chinese, father is baba.  How about that?  In Arabic, mother is om.  In Hebrew, ima.  In English, mama.  In Chinese, it’s mama!

“Suppose,” the tempter whispers, “there was a common language even earlier than Proto-Indo-European.  Suppose there was once a single common language for the whole species!  A Proto-World Language!  All those computers await.  Why, the Nobel Prize wouldn’t be half enough!”

And so there is a sad group, condemned by their innate inability to comprehend the heartless mathematics of regression and increasing error margins, who are trying to reconstruct humanity’s first language, the one from which all the others derived.  And I’ll bet each one has a nice, clean, Nobel-Medal-sized spot on the office wall just waiting for that call from Stockholm.

But again I suppose I shouldn’t be too hard on their well-intended efforts.  For, surprise, surprise, we may just be sharing a mutual destination.  The truth is that as the world gets smaller, we seem to be becoming an exercise in progressive analysis where our world and its languages are starting to share more common words and ideas, merging towards one.

Thinking about it, I suppose we have been doing that for a long time…with a certain level of confusion.

The other day I was driving with a friend.  As I turned onto the freeway on-ramp I wanted to say something about those traffic lights they put on the ramps to regulate them, but I couldn’t remember what we call them.  Then it popped into my head:  Meter.  Ah, yes.  Traffic meters.  The term actually comes from the Greek word metron, measure, but in school I remember being taught it came from the Greek word meter, mother.  That seems better, somehow, sitting there, being metered, waiting for permission to go.

Mother, may I?

