When the journalist Roger Lewin in 1987 dubbed the common maternal ancestor of all people living today “Mitochondrial Eve,” he evoked a creation story, that of a woman who was the mother of us all, and whose descendants dispersed throughout the earth. The name captured the collective imagination, and is still used not only by the public but also by many scientists to refer to this common maternal ancestor.
But the name has been more misleading than helpful. It has fostered the mistaken impression that all of our DNA comes from precisely two ancestors and that to learn about our history it would be sufficient to simply track the purely maternal line represented by mitochondrial DNA, and the purely paternal line represented by the Y chromosome.
Inspired by this possibility, the National Geographic Society’s “Genographic Project,” beginning in 2005, collected mitochondrial DNA and Y-chromosome data from close to a million people of diverse ethnic groups. But the project was outdated even before it began. It has been largely recreational, and has produced few interesting scientific results. From the outset, it was clear that most of the information about the human past present in mitochondrial DNA and Y-chromosome data had already been mined, and that far richer stories were buried in the whole genome. The truth is that the genome contains the stories of many diverse ancestors (tens of thousands of independent genealogical lineages), not just the two whose stories can be traced with the Y chromosome and mitochondrial DNA.
To understand this, one needs to realize that beyond mitochondrial DNA, the genome is not one continuous sequence from a single ancestor but is instead a mosaic. Forty-six of the mosaic tiles, as it were, are chromosomes—long stretches of DNA that are physically separated in the cell. A genome consists of twenty-three chromosomes, and because a person carries two genomes, one from each parent, the total number is forty-six.
But the chromosomes themselves are mosaics of even smaller tiles. For example, the first third of a chromosome a woman passes down to her egg might come from her father and the last two-thirds from her mother, the result of a splicing together of her father’s and mother’s copies of that chromosome in her ovaries. Females create an average of about forty-five new splices when producing eggs, while males create about twenty-six splices when producing sperm, for a total of about seventy-one new splices per generation. So it is that as we trace each generation back further into the past, a person’s genome is derived from an ever-increasing number of spliced-together ancestral fragments.
This means that our genomes hold within them a multitude of ancestors. Any person’s genome is derived from 47 stretches of DNA corresponding to the chromosomes transmitted by mother and father plus mitochondrial DNA. One generation back, a person’s genome is derived from about 118 (47 plus 71) stretches of DNA transmitted by his or her parents. Two generations back, the number of ancestral stretches of DNA grows to around 189 (47 plus 71 plus another 71) transmitted by four grandparents. Look even further back in time, and the additional increase in ancestral stretches of DNA every generation is rapidly overtaken by the doubling of ancestors. Ten generations back, for example, the number of ancestral stretches of DNA is around 757 but the number of ancestors is 1,024, guaranteeing that each person has several hundred ancestors from whom he or she has received no DNA whatsoever. Twenty generations in the past, the number of ancestors is almost a thousand times greater than the number of ancestral stretches of DNA in a person’s genome, so it is a certainty that each person has not inherited any DNA from the great majority of his or her actual ancestors.
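The arithmetic in this paragraph can be reproduced in a few lines. What follows is only a sketch of the counting argument as stated here: it uses the figures of 47 starting stretches and about 71 new splices per generation, and it assumes a family tree with no inbreeding, so that the number of ancestor slots doubles every generation. The function names are illustrative.

```python
# Constants taken from the text: 46 chromosomes plus mitochondrial DNA give
# 47 stretches, and about 71 new splices are added with each generation.
BASE_STRETCHES = 47
SPLICES_PER_GENERATION = 71


def ancestral_stretches(generations_back: int) -> int:
    """Approximate number of ancestral DNA stretches a genome derives from."""
    return BASE_STRETCHES + SPLICES_PER_GENERATION * generations_back


def genealogical_ancestors(generations_back: int) -> int:
    """Number of ancestor slots in the family tree, assuming no inbreeding."""
    return 2 ** generations_back


# Compare linear growth in stretches with the doubling of ancestors.
for g in (1, 2, 10, 20):
    stretches = ancestral_stretches(g)
    ancestors = genealogical_ancestors(g)
    print(f"{g:>2} generations back: {stretches:>5} stretches, {ancestors:>9,} ancestors")
```

The stretches grow by a fixed amount each generation while the ancestors double, which is why the second count overtakes the first within about ten generations.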
These calculations mean that a person’s genealogy, as reconstructed from historical records, is not the same as his or her genetic inheritance. The Bible and the chronicles of royal families record who begat whom over dozens of generations. Yet even if the genealogies are accurate, Queen Elizabeth II of England almost certainly inherited no DNA from William of Normandy, who conquered England in 1066 and who is believed to be her ancestor twenty-four generations back in time. This does not mean that Queen Elizabeth II did not inherit DNA from ancestors that far back, just that it is expected that only about 1,751 of her 16,777,216 twenty-fourth-degree genealogical ancestors contributed any DNA to her. This is such a small fraction that the only way William could plausibly be her genetic ancestor is if he was her genealogical ancestor in thousands of different lineage paths, which seems unlikely even considering the high level of inbreeding in the British royal family.
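To get a feel for why thousands of lineage paths would be needed, here is a rough back-of-the-envelope sketch. It rests on simplifications that are assumptions of this example, not claims from the text: each ancestral stretch is treated as landing in a different ancestor slot, and each lineage path through the family tree is treated as an independent chance of carrying DNA. The function names are hypothetical.

```python
def ancestral_stretches(generations_back: int) -> int:
    """Stretch count from the text: 47 plus about 71 per generation."""
    return 47 + 71 * generations_back


def chance_single_path(generations_back: int) -> float:
    """Rough share of ancestor slots that can contribute any DNA at all."""
    return ancestral_stretches(generations_back) / 2 ** generations_back


def chance_any_dna(generations_back: int, lineage_paths: int) -> float:
    """Chance that at least one of several paths carries DNA.

    Assumes the paths are independent, which is a crude simplification.
    """
    p = chance_single_path(generations_back)
    return 1.0 - (1.0 - p) ** lineage_paths


# Twenty-four generations back, as in the William of Normandy example.
for paths in (1, 100, 1000, 5000):
    print(f"{paths:>5} lineage paths: chance of any DNA = {chance_any_dna(24, paths):.4f}")
```

Under these assumptions a single path gives a chance on the order of one in ten thousand, and it takes several thousand paths before the chance becomes substantial.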
Going back deeper in time, a person’s genome gets scattered into more and more ancestral stretches of DNA spread over ever-larger numbers of ancestors. Tracing back fifty thousand years in the past, our genome is scattered into more than one hundred thousand ancestral stretches of DNA, greater than the number of people who lived in any population at that time, so we inherit DNA from nearly everyone in our ancestral population who had a substantial number of offspring at times that remote in the past.
Through random drift or selection, the female lineage will trace back to a single female, such as Mitochondrial Eve. In this example spanning five generations, the colors represent matrilineal lines that have gone extinct, and black represents the matrilineal line descended from the mtDNA most recent common ancestor (MRCA). (ChrisTi / CC BY-SA 3.0)
There is a limit, though, to the information that comparison of genome sequences provides about deep time. At each place in the genome, if we trace back our lineages far enough into the past, we reach a point where everyone descends from the same ancestor, beyond which it becomes impossible to obtain any information about deeper time from comparison of the DNA sequences of people living today.
From this perspective, the common ancestor at each point in the genome is like a black hole in astrophysics, from which no information about deeper time can escape. For mitochondrial DNA this black hole occurs around 160,000 years ago, the date of “Mitochondrial Eve.” For the great majority of the rest of the genome the black hole occurs between five million and one million years ago, and thus the rest of the genome can provide information about far deeper time than is accessible through analysis of mitochondrial DNA. Beyond this, everything goes dark.
The power of tracing this multitude of lineages to reveal the past is extraordinary. In my mind’s eye, when I think of a genome, I view it not as a thing of the present, but as deeply rooted in time, a tapestry of threads consisting of lines of descent and DNA sequences copied from parent to child winding back into the distant past. Tracing back, the threads wind themselves through ever more ancestors, providing information about population size and substructure in each generation.
When an African American person is said to have 80 percent West African and 20 percent European ancestry, for example, a statement is being made that about five hundred years ago, prior to the population migrations and mixtures precipitated by European colonialism, 80 percent of the person’s ancestral threads probably resided in West Africa and the remainder probably lived in Europe. But such statements are like still frames in a movie, capturing one point in the past. An equally valid perspective is that one hundred thousand years ago, the vast majority of lineages of African American ancestors, like those of everyone today, were in Africa.
The Story Told by the Multitudes in Our Genomes
In 2001, the human genome was sequenced for the first time—which means that the great majority of its chemical letters were read. About 70 percent of the sequence came from a single individual, an African American, but some came from other people. By 2006, companies began selling robots that reduced the cost of reading DNA letters by more than ten thousandfold and soon by one hundred thousandfold, making it economical to map the genomes of many more people. It thus became possible to compare sequences not just from a few isolated locations, such as mitochondrial DNA, but from the whole genome. That made it possible to reconstruct each person’s tens of thousands of ancestral lines of descent. This revolutionized the study of the past. Scientists could gather orders of magnitude more data, and test whether the history of our species suggested by the whole genome was the same as that told by mitochondrial DNA and the Y chromosome.
A 2011 paper by Heng Li and Richard Durbin showed that the idea that a single person’s genome contains information about a multitude of ancestors was not just a theoretical possibility, but a reality. To decipher the deep history of a population from a single person’s DNA, Li and Durbin leveraged the fact that any single person actually carries not one but two genomes: one from his or her father and one from his or her mother. Thus it is possible to count the number of mutations separating the genome a person receives from his or her mother and the genome the person receives from his or her father to determine when they shared a common ancestor at each location.
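The core idea of counting mutations to date a common ancestor can be sketched as follows. This is a deliberately simplified illustration, not the statistical model used in the paper. The mutation rate and generation time below are assumed values chosen for the example, and the function names are hypothetical.

```python
# Assumed values for illustration only; neither number comes from the text.
MUTATION_RATE = 1.25e-8      # mutations per DNA letter per generation
YEARS_PER_GENERATION = 25    # rough human generation time


def generations_to_common_ancestor(differences: int, window_length: int,
                                   mutation_rate: float = MUTATION_RATE) -> float:
    """Estimate how long ago two copies of a DNA window shared an ancestor.

    Mutations accumulate on both lines of descent, so the expected number of
    differences is 2 * mutation_rate * window_length * generations.
    """
    return differences / (2 * mutation_rate * window_length)


# Example: the maternal and paternal copies differ at 10 letters
# within a window of 10,000 letters.
generations = generations_to_common_ancestor(differences=10, window_length=10_000)
years = generations * YEARS_PER_GENERATION
print(f"about {generations:,.0f} generations, or {years:,.0f} years")
```

Repeating such an estimate window by window along the genome gives a collection of dates, one for each location, which is the raw material the next paragraph builds on.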
By examining the range of dates when these ancestors lived—plotting the ages of one hundred thousand Adams and Eves—Li and Durbin established the size of the ancestral population at different times. In a small population, there is a substantial chance that two randomly chosen genome sequences derive from the same parent genome sequence, because the individuals who carry them share a parent. However, in a large population the chance is far lower. Thus, the times in the past when the population size was low can be identified based on the periods in the past when a disproportionate fraction of lineages have evidence of sharing common ancestors.
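The link between population size and shared ancestry can be made concrete with a textbook idealization. The sketch below assumes a randomly mating population of constant size in which each of the 2N genome copies picks its parent copy at random. That model is an assumption of this example and is far simpler than the analysis Li and Durbin carried out. The function names are illustrative.

```python
def shared_parent_chance(population_size: int) -> float:
    """Chance per generation that two genome copies derive from the same
    parent copy, in an idealized population of N people (2N genome copies)."""
    return 1.0 / (2 * population_size)


def fraction_with_common_ancestor(population_size: int, generations: int) -> float:
    """Fraction of lineage pairs expected to have found a common ancestor
    within the given number of generations, at constant population size."""
    p = shared_parent_chance(population_size)
    return 1.0 - (1.0 - p) ** generations


# Compare a small and a large population over the same stretch of time.
for size in (1_000, 100_000):
    share = fraction_with_common_ancestor(size, generations=1_000)
    print(f"population of {size:>7,}: {share:.1%} of lineage pairs share an ancestor")
```

In the small population a large share of lineage pairs find a common ancestor within the period, while in the large one very few do. This is the signal that lets periods of small population size be read from the dates of common ancestors.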
Walt Whitman, in the poem “Song of Myself,” wrote, “Do I contradict myself? / Very well, then I contradict myself, / (I am large, I contain multitudes).” Whitman could just as well have been talking about the Li and Durbin experiment and its demonstration that a whole population history is contained within a single person as revealed by the multitude of ancestors whose histories are recorded within that person’s genome.
An unanticipated finding of the Li and Durbin study was its evidence that after the separation of non-African and African populations, there was an extended period in the shared history of non-Africans when populations were small, as reflected in evidence for many shared ancestors spread over tens of thousands of years. A shared “bottleneck event” among non-Africans, when a small number of ancestors gave rise to a large number of descendants today, was not a new finding. But prior to Li and Durbin’s work, there was no good information about the duration of this event, and it seemed plausible that it could have transpired over just a few generations: for example, a small band of people crossing the Sahara into North Africa, or from Africa into Asia.
The Li and Durbin evidence of an extended period of small population size was also hard to square with the idea of an unstoppable expansion of modern humans both within and outside Africa around fifty thousand years ago. Our history may not be as simple as the story of a dominant group that was immediately successful wherever it went.
By David Reich
© [Oxford University Press]. Extract from Who We Are and How We Got Here: Ancient DNA and the new science of the human past by David Reich, published by Oxford University Press, available in hardback, paperback and eBook formats, £10.99