Tag Archives: Neanderthal

Siberian Cave Yields Oldest Fossils Belonging to Enigmatic Human Species

The entrance to Denisova Cave in Sibera.
Image: IAET, Siberian Branch Russian Academy of Sciences

The Denisovans, a mysterious group of extinct hominins closely related to Neanderthals, didn’t leave much fossil evidence behind. A fresh dig at their former stomping grounds in Siberia has now yielded three new fossils—the oldest yet found of this species.

Katerina Douka, an evolutionary anthropologist from the University of Vienna, and her colleagues found the fossils in Denisova Cave, a natural shelter located in the Altai Mountains of southern Siberia. The scientists were studying the oldest layers in the cave, which up until now had failed to produce a single human fossil. A total of five human fossil fragments were recovered: three belonging to Denisovans, one from a Neanderthal, and one that could not be identified. The largest of these fragments measures no larger than 1.6 inches (4 centimeters) long.

Remarkably, this small but precious handful of fossils were found amid a jumble of 3,791 animal bone fragments. The researchers used a biomolecular method known as peptide fingerprinting to identify the bones, as it was not possible to do so through manual inspection. The five bones contained collagen consistent with the peptide profiles of humans (peptides are the building blocks of proteins), allowing for the identification (as a reminder, Denisovans and Neanderthals are humans).

Some of the bone fragments recovered in Denisova Cave.
Image: S. Brown

“Finding one new human bone would have been cool, but five? This exceeded my wildest dreams,” Samantha Brown, a co-author of the study and junior group leader at the University of Tübingen, said in a Max Planck Institute statement.

Denisova Cave is an “amazing place” when it comes to the preservation of DNA, and “we have now reconstructed genomes from some of the oldest and best-preserved human fossils,” said Diyendo Massilani, a geneticist from the Max Planck Institute of Evolutionary Anthropology, in the release. The team found enough DNA to reconstruct mitochondrial genomes, allowing them to confirm the bones as belonging to Denisovans and Neanderthals. A paper detailing this discovery has been published in Nature Ecology & Evolution.

The layer containing the Denisovan bones was dated to roughly 200,000 years old. Previous Denisovan fossils were dated to between 122,000 and 194,000 years old, so these are now the oldest. The lone Neanderthal bone was dated to between 130,000 and 150,000 years ago. The Altai mountains appear to “be an overlapping zone for both Denisovan and Neanderthal groups for over 150,000 years, witnessing and possibly facilitating population [interbreeding] as well as sustaining distinct hominin populations over this long period,” according to the paper.

The three new Denisovan fossils add to the six already discovered, including a finger bone from which DNA was extracted and a mandible found in a cave on the Tibetan Plateau—the first and only Denisovan fossil found outside of Siberia. Denisovans were closely related to Neanderthals, and they interbred with modern humans prior to going extinct some 50,000 to 30,000 years ago. Traces of Denisovan DNA exist within the genomes of modern southeast Asian and Oceanian populations.

As the scientists write in their study, “a wealth of archaeological material” in the form of stone tools and animal remains were found within the Denisovan layer. It’s the first time that archaeological evidence has been definitively linked to these hominins, allowing for new insights into their behavior.

Interestingly, the style of the recovered stone tools, like scraping tools for working animal hides, could not be matched to any known lithic tradition. Living near the Anui river and occupying the caves during a warm period, the Denisovans hunted bison, deer, gazelle, antelope, and woolly rhinos, in a subsistence pattern that lasted for thousands of years, as the researchers point out.

These three Denisovan bones will likely produce more science in the years to come, as will Denisova Cave in general. Slowly but very methodically, we’re uncovering more about these remarkable humans.

More: Scientists Have Discovered a Hotspot of Denisovan Ancestors.

Read original article here

Ancient case of disease spillover discovered in Neanderthal man who got sick butchering raw meat

Researchers were reexamining the fossilized bones of a Neanderthal who was found in a cave near the French village of La Chapelle-aux-Saints in 1908. The “Old Man of La Chapelle,” as he became known, was the first relatively complete Neanderthal skeleton to be unearthed and is one of the best studied.

More than a century after his discovery, his bones are still yielding new information about the lives of Neanderthals, the heavily built Stone Age hominins that lived in Europe and parts of Asia before disappearing about 40,000 years ago.

The man, thought to be in his late 50s or 60s when he died about 50,000 years ago, had advanced osteoarthritis in his spinal column and hip joint, a study from 2019 had confirmed.

However, during that reanalysis, Dr. Martin Haeusler — a specialist in internal medicine and head of the University of Zurich’s Evolutionary Morphology and Adaptation Group at the Institute of Evolutionary Medicine — realized that not all the changes in the bones could be explained by the wear and tear of osteoarthritis.

“Rather, we found that some of these pathological changes must be due to inflammatory processes,” he said.

“A comparison of the entire pattern of the pathological changes found in the La Chapelle-aux-Saints skeleton with many different diseases led us then to the diagnosis of brucellosis.”

The study with those findings was published in the journal Scientific Reports last month.

Zoonotic disease

Brucellosis is a disease that’s still widespread today. Humans generally acquire the disease through direct contact with infected animals, by eating or drinking contaminated animal products, or by inhaling airborne agents, according to the World Health Organization. Most cases are caused by unpasteurized milk or cheese from infected goats or sheep.

It’s also one of the most common zoonotic diseases — illnesses that are transmitted from animals to humans. They include viruses like HIV and the coronavirus that caused the Covid-19 pandemic.

Brucella has a wide range of symptoms, including fever, muscular pain and night sweats, Haeusler said. It can last from a few weeks to many months or even years. Long-term problems resulting from the disease are variable but can include arthritis pain, back pain, inflammation of the testes — which can lead to infertility — and inflammation of the heart valves known as endocarditis, which Haeusler said was the most common cause of death from the disease.

The paper said the case was “the earliest secure evidence of this zoonotic disease in hominin evolution.”

The disease has also been found in Bronze Age Homo sapiens skeletons, which date back to around 5,000 years ago.

Diet

Brucellosis is found in many wild animals today, and Haeusler said that the Neanderthal man likely caught the disease from butchering or cooking an animal that had been hunted as prey. Possible sources include wild sheep, goats, wild cattle, bison, reindeer, hares and marmots — all of which were components of the Neanderthal diet. However, the paper said that the two large animals Neanderthals hunted, mammoths and woolly rhinoceros, were unlikely to be the disease reservoir — at least based on the animals’ living relatives, in which brucellosis has been largely undetected.

Given the man lived to what must have been a very old age for the period, Haeusler suspected that the Neanderthal may have had a milder version of the disease.

The “Old Man of Chapelle” played a significant role in misconceptions about Neanderthals being primitive Stone Age brutes, according to the Smithsonian. More recent research suggests that they were just as smart as we are.

An early reconstruction of the skeleton depicted the man with a slouching posture, bent knees and the head jutted forward. It was only later that scientists realized the skeleton had a deforming kind of osteoarthritis and perhaps was not a typical Neanderthal.

Haeusler said the study he published in 2019 showed that, even with the wear and tear from degenerative osteoarthritis, the “Old Man of Chapelle” would have walked upright. The man also had lost most of his teeth and may have had to have been fed by other members of his group.

Read original article here

Ancient case of disease spillover discovered in Neanderthal man who got sick butchering raw meat

Researchers were reexamining the fossilized bones of a Neanderthal who was found in a cave near the French village of La Chapelle-aux-Saints in 1908. The “Old Man of La Chapelle,” as he became known, was the first relatively complete Neanderthal skeleton to be unearthed and is one of the best studied.

More than a century after his discovery, his bones are still yielding new information about the lives of Neanderthals, the heavily built Stone Age hominins that lived in Europe and parts of Asia before disappearing about 40,000 years ago.

The man, thought to be in his late 50s or 60s when he died about 50,000 years ago, had advanced osteoarthritis in his spinal column and hip joint, a study from 2019 had confirmed.

However, during that reanalysis, Dr. Martin Haeusler — a specialist in internal medicine and head of the University of Zurich’s Evolutionary Morphology and Adaptation Group at the Institute of Evolutionary Medicine — realized that not all the changes in the bones could be explained by the wear and tear of osteoarthritis.

“Rather, we found that some of these pathological changes must be due to inflammatory processes,” he said.

“A comparison of the entire pattern of the pathological changes found in the La Chapelle-aux-Saints skeleton with many different diseases led us then to the diagnosis of brucellosis.”

The study with those findings was published in the journal Scientific Reports last month.

Zoonotic disease

Brucellosis is a disease that’s still widespread today. Humans generally acquire the disease through direct contact with infected animals, by eating or drinking contaminated animal products, or by inhaling airborne agents, according to the World Health Organization. Most cases are caused by unpasteurized milk or cheese from infected goats or sheep.

It’s also one of the most common zoonotic diseases — illnesses that are transmitted from animals to humans. They include viruses like HIV and the coronavirus that caused the Covid-19 pandemic.

Brucella has a wide range of symptoms, including fever, muscular pain and night sweats, Haeusler said. It can last from a few weeks to many months or even years. Long-term problems resulting from the disease are variable but can include arthritis pain, back pain, inflammation of the testes — which can lead to infertility — and inflammation of the heart valves known as endocarditis, which Haeusler said was the most common cause of death from the disease.

The paper said the case was “the earliest secure evidence of this zoonotic disease in hominin evolution.”

The disease has also been found in Bronze Age Homo sapiens skeletons, which date back to around 5,000 years ago.

Diet

Brucellosis is found in many wild animals today, and Haeusler said that the Neanderthal man likely caught the disease from butchering or cooking an animal that had been hunted as prey. Possible sources include wild sheep, goats, wild cattle, bison, reindeer, hares and marmots — all of which were components of the Neanderthal diet. However, the paper said that the two large animals Neanderthals hunted, mammoths and woolly rhinoceros, were unlikely to be the disease reservoir — at least based on the animals’ living relatives, in which brucellosis has been largely undetected.

Given the man lived to what must have been a very old age for the period, Haeusler suspected that the Neanderthal may have had a milder version of the disease.

The “Old Man of Chapelle” played a significant role in misconceptions about Neanderthals being primitive Stone Age brutes, according to the Smithsonian. More recent research suggests that they were just as smart as we are.

An early reconstruction of the skeleton depicted the man with a slouching posture, bent knees and the head jutted forward. It was only later that scientists realized the skeleton had a deforming kind of osteoarthritis and perhaps was not a typical Neanderthal.

Haeusler said the study he published in 2019 showed that, even with the wear and tear from degenerative osteoarthritis, the “Old Man of Chapelle” would have walked upright. The man also had lost most of his teeth and may have had to have been fed by other members of his group.

Read original article here

A Shockingly Small Percentage of Our DNA Is Uniquely ‘Human’, Study Finds

By now you might have heard the factoid that modern humans share a pretty large chunk of our genomes with bananas. But delving down much deeper, how much of our genome is uniquely Homo sapiens.

 

A new study has suggested that number could be as small as 1.5 percent, with the rest being shared with our ancient relatives such as Neanderthals and Denisovans.

“We generate a map within human genomes of archaic ancestry and of genomic regions not shared with archaic hominins,” the team wrote in their new paper.

“We find that only 1.5 to 7 percent of the modern human genome is uniquely human.”

Untangling what is ours and what came from our ancient kin is a difficult task. How do you tell which genetic variants are due to interbreeding (also called admixing) of Neanderthals and Homo sapiens for example, rather than variants that were passed onto both species from a common ancestor?

The team wanted to create a system that could identify both admixture events as well as this shared inheritance – called incomplete lineage sorting – that would help tell us which regions of our genome are unique to us.

They created an algorithm called SARGE – Speedy Ancestral Recombination Graph Estimator – so they could map how our genes have weaved through time and species, separating and joining back together at different points using something called ancestral recombination graphs.

 

They ran SARGE on 279 modern human genomes from Africa and elsewhere, two high-quality Neanderthal genomes, and one high-quality Denisovan genome.

“Using the resulting ancestral recombination graph, we map Neanderthal and Denisovan ancestry, incomplete lineage sorting, and the absence of both across modern human genomes,” the team wrote.

“We find evidence of at least one wave of Neanderthal admixture into the ancestors of all non-Africans.”

Along with the 1.5 to 7 percent of the genome that’s unique to modern humans, they also found “evidence of multiple bursts of adaptive changes specific to modern humans within the past 600,000 years involving genes related to brain development and function”.

The researchers explain that most of those genes that were uniquely ours were not genes with unknown functions, instead they were well known genes which coded for proteins used in the brain.

Obviously, this is not even close to the end of the story. For starters, between 1.5 and 7 percent is a pretty large range and the team think they can make it more specific with more genomes and more research.

There have also been plenty of other analyses looking at the percentage of DNA we take from our ancient cousins, so it’s unlikely this will be the last word on the matter.

 

Plus, SARGE isn’t able to tell the researchers why those bursts of adaptive changes happened when they did.

However, the team already has some ideas.

“It’s extremely tempting to speculate that one or more of these bursts had something to do the incredibly social behavior humans have – mediated in large part by our expert control of speech and language,” University of California, Santa Cruz paleogeneticist and one of the researchers, Richard Green, told Business Insider.

The research has been published in Science Advances.

 

Read original article here

An ancestral recombination graph of human, Neanderthal, and Denisovan genomes

INTRODUCTION

Much of the current genetic variation within humans predates the split, estimated at 520 to 630 thousand years (ka) ago (1), between the populations that would become modern humans and Neanderthals. The shared genetic variation present in our common ancestral population is still largely present among humans today and was present in Neanderthals up until the time of their extinction. This phenomenon, which is known as incomplete lineage sorting (ILS), means that any particular human will share many alleles with a Neanderthal that are not shared with some other humans. Therefore, humans often share genetic variation with Neanderthals not by admixture but rather by shared inheritance from a population ancestral to us both. Because of this, any effort to map ancestry from archaic hominins in human genomes must disentangle admixture from ILS. Furthermore, a technique able to identify both admixture and ILS could produce a catalog of uniquely human genomic regions that is free of both and thereby shed light on the evolutionary processes that have been important in our origin as a unique species.

Ancestral recombination graph (ARG) inference (2) is a powerful starting point for such an analysis. An ARG can be conceptualized as a series of trees, mapped to individual sites, over phased haplotypes (chromosomes) in a panel of genomes. Ancestral recombination events, or sites at which chromosome segments with different histories were joined together by historical recombination, form boundaries between trees. Each ancestral recombination event manifests as a clade of haplotypes, all of which descend from the first ancestral haplotype to have it, moving from one position in the tree upstream of the event to a new position in the downstream tree (3). ARGs are complete descriptions of phylogenomic datasets and present for recombining genomes what single trees present for nonrecombining genomes, i.e., a complete description of their genetic relationships. As prior techniques for ancestry mapping can be thought of as summaries of the ARG, higher resolution ancestry maps could be produced if the ARG were known. In addition, the ARG can be used to estimate the time to most recent common ancestor (TMRCA) between admixed and admixer haplotypes, providing additional information about historical admixture between humans and archaic hominins.

Given the utility of an ARG, it is expected that several methods have been devised for estimating ARGs from genetic data. These published approaches all have different strengths and limitations. BEAGLE (3), ArgWeaver (4), and Rent+ (5) were designed for small datasets and require substantial time and/or memory to be used with large sequencing panels. Margarita (6) randomly samples histories at ancestral recombination event boundaries and does not seek to produce parsimonious recombination histories (6). ArgWeaver (4), which is widely considered the gold standard in ARG inference, requires prior knowledge of demographic model parameters. Relate (7) is a relatively new method that scales well to large datasets and produces trees without polytomies and with branch lengths but, in doing so, necessarily samples some relationships that are not directly inferred from the data, as do several other methods (4, 5). The most computationally efficient approach, tsinfer (8), also scales to large datasets but assumes that frequency of an allele is correlated with its age. Since this assumption is violated at loci undergoing either admixture or selection, tsinfer is not well suited for ARG inference using genetic data from Neanderthals, Denisovans, and modern humans.

Here, we present a heuristic, parsimony-guided ARG inference algorithm called SARGE (Speedy Ancestral Recombination Graph Estimator) and use it to build a genome-wide ARG of both modern human and archaic hominin genomes. SARGE can run on thousands of phased genomes, makes no prior assumptions other than parsimony, heuristically estimates branch lengths, and avoids making inferences about unobserved relationships by leaving polytomies in output trees. We validate SARGE using simulated data and demonstrate that it has high specificity compared to existing methods in reconstructing the topology of trees, making it suitable for identifying archaic admixture segments. To achieve this high specificity, SARGE avoids describing some relationships in output trees, resulting in lower sensitivity than existing methods.

We run SARGE on a panel of 279 modern human genomes, two high-coverage Neanderthal genomes, and one high-coverage Denisovan genome. Using the resulting ARG, we map Neanderthal and Denisovan ancestry, ILS, and the absence of both across modern human genomes. We find evidence of at least one wave of Neanderthal admixture into the ancestors of all non-Africans. We also identify several long and deeply divergent Neanderthal haplotype blocks that are specific to some human populations. We find support for the hypothesis that Denisovan-like ancestry is the result of multiple introgression events from different source populations (9, 10). We also detect an excess of Neanderthal and Denisovan haplotype blocks unique to South Asian genomes. Last, we pinpoint human-specific changes likely to have been affected by selection since the split with archaic hominins, many of which are involved in brain development.

RESULTS

ARG algorithm

To build an ARG containing both modern human and archaic hominin genomes without the use of a demographic model or the need to infer ancestral haplotypes, we developed a parsimony-based ARG inference technique, SARGE. SARGE uses both shared derived alleles and inferred, shared ancestral recombination events to articulate trees (Fig. 1A and Supplementary Methods). SARGE uses the four-gamete test (11) to determine regions of recombination and the affected haplotypes. The crux of SARGE is a fast algorithm for choosing the branch movement(s) capable of explaining the highest number of discordant clades across a genomic segment that fails the four-gamete test. Once the branch movements, i.e., inferred ancestral recombinations, are determined, further definition of clades is possible. Thus, the trees are articulated by both shared alleles and shared ancestral recombination events (figs. S1 and S2 and Supplementary Methods). SARGE infers branch lengths via a heuristic method, compensating for mutation rate variation across the genome by comparing the number of mutations on each branch to the divergence to an outgroup genome in a fixed width region around each site (fig. S7 and Supplementary Methods).

Fig. 1 Data structure and performance of SARGE on simulated data.

(A) Schematic of data structure. Top: Rectangles are “tree nodes” representing clades in trees. Each clade has member haplotypes (shown with letters A to G) and a start and end coordinate (blue numbers in brackets) determined by coordinates of single-nucleotide polymorphism (SNP) sites tagging the clade (yellow numbers in braces), along with a propagation distance parameter (100 in this example). Parent/child edges (vertical arrows) also have start and end coordinates determined by the nodes. Ovals are candidates for clades sharing an ancestral recombination event that can explain four-gamete test failures; colored edges indicate potential paths between tree nodes through candidate nodes that could explain four-gamete test failures (colors indicate types of paths). The candidate node with the most edges (here, AB) is eventually chosen as the most parsimonious branch movement, allowing for the inference of new nodes. The two trees at the bottom show the “solved” ancestral recombination event with the branch movement marked in red and all clades inferred without SNP data marked with yellow stars (haplotypes A and B share an ancestral recombination event; their ancestry is shared with haplotypes C, D, and G upstream of the recombination event and haplotype E downstream of it). The coordinates of the recombination event (blue numbers in brackets) are taken to be midway between the highest-coordinate upstream site (left side) and the lowest-coordinate downstream site (right side) involved in recombination. For a more detailed overview of the data structure, see figs. S3 to S5. (B) Accuracy of SARGE on simulated data (defined as percent of all clades correct according to the true ARG in the simulation), with increasing numbers of human-like haplotypes from an unstructured population. Error bars are one SD across five replicates. (C) Number of nodes per tree with increasing number of haplotypes in simulated data.

” data-icon-position=”” data-hide-link-title=”0″>

Fig. 1 Data structure and performance of SARGE on simulated data.

(A) Schematic of data structure. Top: Rectangles are “tree nodes” representing clades in trees. Each clade has member haplotypes (shown with letters A to G) and a start and end coordinate (blue numbers in brackets) determined by coordinates of single-nucleotide polymorphism (SNP) sites tagging the clade (yellow numbers in braces), along with a propagation distance parameter (100 in this example). Parent/child edges (vertical arrows) also have start and end coordinates determined by the nodes. Ovals are candidates for clades sharing an ancestral recombination event that can explain four-gamete test failures; colored edges indicate potential paths between tree nodes through candidate nodes that could explain four-gamete test failures (colors indicate types of paths). The candidate node with the most edges (here, AB) is eventually chosen as the most parsimonious branch movement, allowing for the inference of new nodes. The two trees at the bottom show the “solved” ancestral recombination event with the branch movement marked in red and all clades inferred without SNP data marked with yellow stars (haplotypes A and B share an ancestral recombination event; their ancestry is shared with haplotypes C, D, and G upstream of the recombination event and haplotype E downstream of it). The coordinates of the recombination event (blue numbers in brackets) are taken to be midway between the highest-coordinate upstream site (left side) and the lowest-coordinate downstream site (right side) involved in recombination. For a more detailed overview of the data structure, see figs. S3 to S5. (B) Accuracy of SARGE on simulated data (defined as percent of all clades correct according to the true ARG in the simulation), with increasing numbers of human-like haplotypes from an unstructured population. Error bars are one SD across five replicates. (C) Number of nodes per tree with increasing number of haplotypes in simulated data.

In the interest of parsimony, our method attempts to infer a set of ancestral recombination events that each explains as many four-gamete test failures as possible. Because the four-gamete test is known to underestimate the true number of ancestral recombination events (12, 13), SARGE will systematically underestimate the true number of ancestral recombination events in a dataset by design. Because of this, SARGE is not well suited to certain tasks, such as the creation of fine-grained recombination maps. We have attempted to mitigate cases where a clade in the ARG should be broken by an unobserved ancestral recombination event, however, by introducing a propagation distance parameter that limits the genomic distance over which each observed clade is allowed to persist (Fig. 1A and Supplementary Methods).

SARGE is scalable to large datasets and achieves higher specificity than many other methods at the cost of lower sensitivity, by leaving uncertainty (polytomies) in the output data. Using simulated data, we find that SARGE runs quickly (figs. S8D and S10), requires little memory, and has 78.93% specificity [95% confidence interval (CI), 78.09 to 78.95%] on average across a range of simulated datasets that include between 50 and 500 haplotypes (see Supplementary Methods). SARGE is at least as specific as alternative techniques (fig. S9, A and C). Conversely, SARGE’s sensitivity (25.36%; 95% CI, 25.32 to 25.40%) is lower than that of other methods (fig. S9, B and D), as SARGE leaves an increasingly large number of polytomies in output trees as the number of input haplotypes increases (Fig. 1, B and C). As expected, SARGE (as well as similar techniques) performs best when the mutation to recombination rate ratio is high, as this makes clades easier to detect (figs. S11 and S12) and suffers slightly in accuracy with increasing amounts of population structure (fig. S13).

We also find that the sensitivity of SARGE can be increased by increasing the propagation distance parameter (see Supplementary Text and fig. S15), that missing clades are likely to be small clades that are likely to be close to the leaves of trees (see Supplementary Text and fig. S16), and that incorrectly inferred clades tend to be within a few kilobases of sites at which those clades exist in truth (see Supplementary Text and fig. S17). We also find, using simulated data, that the SARGE’s branch lengths do not appear to be systematically biased upward or downward (see Supplementary Text and fig. S18).

We ran SARGE on 279 phased human genomes from the Simons Genome Diversity Project (SGDP) (14), together with two high-coverage Neanderthal genomes (1, 15) and one high-coverage Denisovan genome (16). In our analyses, we relied on modern human population labels defined by the SGDP for many analyses, but we split sub-Saharan Africans into one population containing only the most deeply diverged lineages (Biaka, Mbuti, and Khomani-San), which we call “Africa-MBK,” and the remaining genomes (“Africa”). Using these data, we find that the completeness of trees in the ARG (the extent to which all possible clades are present rather than in polytomies) is positively correlated with the local mutation rate to recombination rate ratio (fig. S20A; Spearman’s rho = 0.40; P < 2.2 × 10−16) and that the number of inferred ancestral recombination events per genomic window agrees with a previously published population recombination map (17) (fig. S20B; Spearman’s rho = 0.46; P < 2.2 × 10−16), as expected. Estimates of the mean TMRCA of pairs of haplotypes, taken across all trees, were also concordant with prior knowledge (Fig. 2A).

Fig. 2 Performance of SARGE on SGDP and archaic hominin dataset.

(A) Pairwise coalescence times for randomly sampled sets of up to 10 pairs of phased genome haplotypes per population (every possible pair was considered for archaic hominins, since fewer genomes were available). Values are calibrated using a 13-Ma human-chimp divergence time (see Supplementary Methods) and averaged across every variable site in the dataset, error bars show one SD, and branch shortening values for archaic samples were incorporated into calculations using mean values reported in (1). The lower value for humans comes from removing archaic-admixed clades from trees. (B) UPGMA (unweighted pair group method with arithmetic mean) trees computed using nucleotide diversity from SNP data (top and left) against similarity matrix from shared recombination events inferred by SARGE. Light yellow boxes (similar groups) are Native Americans and Papuans. (C) Average similarity between Orcadian haplotypes in the SGDP panel and other European haplotypes calculated on the basis of the number of shared ancestral recombination events. The best matches are in England, Iceland, and Norway, as expected.

” data-icon-position=”” data-hide-link-title=”0″>

Fig. 2 Performance of SARGE on SGDP and archaic hominin dataset.

(A) Pairwise coalescence times for randomly sampled sets of up to 10 pairs of phased genome haplotypes per population (every possible pair was considered for archaic hominins, since fewer genomes were available). Values are calibrated using a 13-Ma human-chimp divergence time (see Supplementary Methods) and averaged across every variable site in the dataset, error bars show one SD, and branch shortening values for archaic samples were incorporated into calculations using mean values reported in (1). The lower value for humans comes from removing archaic-admixed clades from trees. (B) UPGMA (unweighted pair group method with arithmetic mean) trees computed using nucleotide diversity from SNP data (top and left) against similarity matrix from shared recombination events inferred by SARGE. Light yellow boxes (similar groups) are Native Americans and Papuans. (C) Average similarity between Orcadian haplotypes in the SGDP panel and other European haplotypes calculated on the basis of the number of shared ancestral recombination events. The best matches are in England, Iceland, and Norway, as expected.

Using these data, we found SARGE’s inferences of ancestral recombination events to be accurate. Because SARGE articulates tree clades using either shared allelic variation or shared inferred ancestral recombination, it is possible to test the concordance of trees made from each source. On average, 13.2% of clades in the ARG are known only from inference of shared ancestral recombination events and not by the presence of a shared, derived allele. We created a similarity score between every pair of phased human genome haplotypes in our dataset based on how often the haplotypes share ancestral recombination events. This score recapitulates relationships among humans known from single-nucleotide polymorphism (SNP) data alone (Fig. 2, B and C; Pearson’s r2 with scores from SNP data = 0.989; P < 2.2 × 10−16). We note that genomes with the poorest correlation between SNP-based and recombination-based similarity scores to other genomes are those most likely to contain phasing errors (table S1).

Archaic hominin admixture

We used our ARG to find regions of each phased human genome that derived from admixture with archaic hominins (see Supplementary Methods and fig. S24). If humans and the archaic hominins in our panel were in populations that had sorted their lineages, then this exercise would be simple with a complete and correct ARG. However, since human genome regions are often within a clade that includes hominin haplotypes due to ILS, finding admixed segments requires analysis beyond simply finding clades that unite some human and archaic hominin haplotypes.

We started by selecting clades from ARG trees that united some modern humans with archaic hominins to the exclusion of some other modern humans. We then assigned each human genome haplotype in each such clade as putative Neanderthal, Denisovan, or ambiguous ancestry, depending on whether the clade contained Neanderthal, Denisovan, or both types of haplotypes. We then performed several filtering steps to remove these clades likely to result from ILS. First, we removed any clades that included more than 10% of the Africa-MBK haplotypes from the most basal human lineages, which are unlikely to be admixed. We then discarded clades that persisted for a short distance along the chromosome (which likely represent older haplotypes broken down over time by recombination) or in which the TMRCA between modern humans and archaic hominins was high (see Supplementary Methods and fig. S24). This ascertainment strategy was designed to identify haplotype blocks that we could confidently identify as archaic-introgressed and therefore likely underestimated the true extent of admixture across the genome. Because our method relies on both the haplotype block length and the TMRCA between admixed and introgressor haplotypes to identify admixed segments, we were able to identify some haplotypes that resemble archaic admixture in modern humans but that have relatively high sequence divergence to published archaic genomes (manifesting as high TMRCAs between archaic and modern genomes within these segments).

Using the resulting maps, we calculated genome-wide percent admixture estimates across populations and compared them to

estimates based on the population-wide D statistic (18, 19) using basal Africa-MBK lineages as an outgroup. Since our alleles were polarized relative to the chimpanzee genome and only sites with derived alleles present in hominins were considered, our calculations were of the form D(Africa-MBK, Test, Introgressor, Chimpanzee) / D(Africa-MBK, Introgressor1, Introgressor2, Chimpanzee), where Introgressor1 and Introgressor2 were randomly chosen subsets of half of the introgressor (Neanderthal or Denisovan) haplotypes and the derived allele frequency in chimpanzee was set to 0 at all sites in our dataset. ARG-based estimates are similar to, but lower than, D statistic–based estimates in all non-African genomes, which we expected because of our aggressive filtering strategy for eliminating ILS (see Supplementary Methods). We detected slightly more admixture in sub-Saharan Africans (excluding Africa-MBK) than using the D statistic (Fig. 3A), even when considering the lower end of 95% confidence weighted block jackknife intervals (table S2). We note that a recent study that used an outgroup-free method to detect Neanderthal ancestry blocks in human genomes also found a higher average amount of Neanderthal ancestry in African genomes than has been previously reported (20). As another quality check, we compared our maps of Neanderthal ancestry to those published in prior studies (2023). We found that maps produced by SARGE are about as concordant with the published maps as the published maps are with each other (fig. S25).

Fig. 3 Neanderthal ancestry inferred in modern humans.

(A) Genome-wide percent Neanderthal, Denisovan, and ambiguous (either Neanderthal or Denisovan) across SGDP populations, using the ARG and an estimator based on the D statistic. D statistic calculations considered only one archaic population at a time as introgressor and thus do not detect ambiguous ancestry and also might count some Denisovan ancestry as Neanderthal and vice versa. (B) For individual phased human genome haplotypes (points), mean TMRCA with Neanderthal in Neanderthal haplotype blocks (y axis) and mean Neanderthal haplotype block length (x axis) for all Neanderthal-introgressed haplotypes. TMRCA calculations assume a total of 6.5-Ma human-chimpanzee divergence and branch shortening values from (1), with a mutation rate of 1 × 10−9 per site per year. bp, base pair. (C) Overall number Neanderthal haplotype blocks versus geographically restricted (unique to a 3000-km radius) Neanderthal haplotype blocks. (D) Same as (B), but limited to geographically restricted (unique to a 3000-km radius) Neanderthal haplotype blocks. Only haplotypes with more than 10 geographically restricted segments are shown.

” data-icon-position=”” data-hide-link-title=”0″>

Fig. 3 Neanderthal ancestry inferred in modern humans.

(A) Genome-wide percent Neanderthal, Denisovan, and ambiguous (either Neanderthal or Denisovan) across SGDP populations, using the ARG and an estimator based on the D statistic. D statistic calculations considered only one archaic population at a time as introgressor and thus do not detect ambiguous ancestry and also might count some Denisovan ancestry as Neanderthal and vice versa. (B) For individual phased human genome haplotypes (points), mean TMRCA with Neanderthal in Neanderthal haplotype blocks (y axis) and mean Neanderthal haplotype block length (x axis) for all Neanderthal-introgressed haplotypes. TMRCA calculations assume a total of 6.5-Ma human-chimpanzee divergence and branch shortening values from (1), with a mutation rate of 1 × 10−9 per site per year. bp, base pair. (C) Overall number Neanderthal haplotype blocks versus geographically restricted (unique to a 3000-km radius) Neanderthal haplotype blocks. (D) Same as (B), but limited to geographically restricted (unique to a 3000-km radius) Neanderthal haplotype blocks. Only haplotypes with more than 10 geographically restricted segments are shown.

We next looked for population-specific differences in archaic hominin ancestry in modern humans. Lengths of archaic haplotype segments and the TMRCA to admixer across those segments are both affected by the time of admixture and the divergence between the true admixers and available archaic hominin genomes. We therefore computed the mean of these two values for each ancestry type per phased genome haplotype and compared them across individuals from different populations to look for evidence of distinctive, population-specific admixture events. This analysis revealed distinctive population-specific patterns for Neanderthal and Denisovan ancestry, and many pairwise comparisons of these values between populations are significant (table S3). Segments of ambiguous ancestry produce a pattern resembling a mixture of Neanderthal and Denisovan ancestry, as expected (Fig. 3, B to D, and figs. S26, S28, S37, and S38). We caution, however, that our approach can artificially shorten haplotype block lengths (see Supplementary Methods and figs. S21 and S22), especially for populations such as Papuans and Australians that were absent from the 1000 Genomes Project panel (24) that was used for phasing (14). Nonetheless, Neanderthal haplotype block lengths in Oceania are not significantly shorter than in other populations (Fig. 3B), and incorrect phasing in archaic genomes does not appear to negatively affect results of admixture scans using simulated data (see Supplementary Text).

As expected, the ARG classifies a smaller fraction (0.096 to 0.46%) of sub-Saharan African genomes (excluding Mozabite and Saharawi individuals) as resulting from Neanderthal admixture compared to non-African genomes (0.73 to 1.3%). The haplotype segments of African genomes that are grouped together in clades with Neanderthal haplotypes are distinctive from the haplotype segments found in the genomes of people with non-African ancestry (Fig. 3B and fig. S26A). Namely, the African haplotypes are more dissimilar to the Neanderthal haplotypes with which they are grouped and tend to be shorter. These observations are qualitatively consistent with the model wherein genetic drift may group Neanderthal and African haplotypes, independent of a specific admixture event. It is also possible that these haplotypes are the result of true introgression events from unknown archaic hominins distantly related to the Neanderthal/Denisovan lineage (25). Another recent study using an inferred ARG also found mysterious, divergent haplotypes within sub-Saharan Africans that resembled unknown archaic introgression (7).

Unexpectedly, however, two of the SGDP African populations, Masai and Somali, are intermediate between non-African and African genomes when measuring lengths of archaic haplotype segments and TMRCA to admixers within them (Fig. 3B). These Neanderthal haplotype blocks may have originated in ancient European migrants to eastern Africa (26) and spread beyond eastern Africa through gene flow, which is known to have affected even the basal Africa-MBK lineages (27).

To test this hypothesis, we recomputed the mean length and TMRCA of admixer genomes within archaic-introgressed haplotype segments across all individuals, using only geographically restricted segments. We defined these as any archaic haplotype segments found only in genomes that were sampled within a 3000-km radius of each other (using geodesic distance between sampling coordinates). This analysis showed Masai and Somali genomes to have a comparable number of geographically restricted Neanderthal haplotypes to most other African genomes (Fig. 3C), with similar haplotype block lengths and TMRCAs to admixers within these geographically restricted haplotypes (Fig. 3D). This observation is concordant with the idea that the unusual Neanderthal haplotypes in these populations originated in Eurasian migrants.

Outside of Africa, our Neanderthal introgression maps largely agree with prior studies. We detect a mean TMRCA to Neanderthal of about 74 ka ago across all Neanderthal haplotype blocks in non-African populations, using published corrections for branch shortening in the archaic genomes (1). This value is consistent with published estimates of Neanderthal admixture times and the phylogenetic distance between the Vindija33.19 Neanderthal and the introgressing Neanderthal calculated by a recent study (28). The mean TMRCA between genomic segments detected as Neanderthal admixture segments and the Neanderthal itself is consistent within about 5000 years for all populations outside of Africa (Fig. 3B). We see slightly more Neanderthal ancestry in Central Asia, East Asia, and the Americas than in Europe, South Asia, and Southwest Asia (Fig. 3A). We also find more geographically restricted Neanderthal haplotype blocks in South Asia than elsewhere in mainland Eurasia, and the fewest geographically restricted Neanderthal haplotype blocks in the Americas (Fig. 3C and fig. S35).

Humans in Central and East Asia are known to have elevated Neanderthal ancestry compared to other populations (22). However, there is debate over whether this elevated Neanderthal ancestry is due to smaller past population size relative to other groups and the resulting stronger effect of genetic drift (22) or to additional pulses of Neanderthal admixture specific to these populations (9, 29). Although we detect more Neanderthal ancestry in Central and East Asians than in West Eurasians, we detect a similar number of geographically restricted haplotype blocks (unique to a 3000-km radius) in both groups (Fig. 3C). Further, Neanderthal haplotype blocks are shorter on average and therefore older in Central and East Asians than in West Eurasians (Fig. 3B). This implies that the excess Neanderthal ancestry in Central and East Asians mostly comprises broadly shared haplotype blocks from introgression common to all non-Africans, consistent with the drift scenario. Another recent study (20) suggested that excess Neanderthal ancestry in Central and East Asians could be mostly due to underestimating Neanderthal ancestry in West Eurasians as a result of using sub-Saharan Africans, who share some Neanderthal ancestry with West Eurasians, as a model unadmixed outgroup. We reject this explanation because removing Neanderthal haplotypes from West Eurasians would likely increase the number of geographically restricted Neanderthal haplotypes in Asians, contrary to our observation. Furthermore, South Asians appear to have a comparable amount of Neanderthal ancestry to West Eurasians (Fig. 3, B and C), despite sharing few Neanderthal haplotype blocks with Africans relative to West Eurasians (fig. S37A). Last, because our strategy for ascertaining Neanderthal haplotype used only Mbuti, Biaka, and Khomani-San genomes as an outgroup and allowed up to 10% of these genomes to have Neanderthal ancestry, we do not believe our results were significantly biased by our choice of outgroup. This is further demonstrated by the fact that we detect a non-negligible amount of Neanderthal ancestry in all African groups (table S2).

Aside from these broadly shared haplotype blocks, we also observe geographically restricted Neanderthal haplotype blocks in each non-African population in our panel. These population-specific haplotype blocks tend to be longer than the shared haplotype blocks and to have an older TMRCA to the Neanderthal genome than the broadly shared haplotype blocks (Fig. 3D, compared to all blocks shown in Fig. 3B). These observations suggest that the population-specific haplotype blocks may be the result of more recent population-specific Neanderthal admixture, as has recently been suggested (2628, 30).

We next investigated population-specific patterns within Denisovan ancestry segments and found that these segments probably originate from admixture with multiple, divergent individuals that were distantly related to the Denisovan genome. This implies that the Denisovan genome is not a good model for the actual population that admixed with humans with “Denisovan” ancestry. Prior studies have suggested that Denisovan-like haplotype blocks in humans have two or three distinct sources with different levels of divergence to the Denisovan genome, with the best-matching haplotype blocks in East Asia (9, 10). We uncover the same signal: Geographically restricted Denisovan haplotype blocks have the lowest TMRCA to the Denisovan genome in East Asian genomes (mean TMRCA to Denisovan of 125 ka ago) (figs. S26 and S27).

Unexpectedly, we detected many Neanderthal and Denisovan-like haplotype blocks that are unique to South Asia (Fig. 3C and figs. S26C, S45, S35, and S36) and many Neanderthal haplotype blocks that are unique to Oceania (Fig. 3C and figs. S44 and S35). These geographically restricted Neanderthal haplotype blocks are no more divergent to the Neanderthal genome than those specific to other populations (Fig. 3D), complicating any interpretation of these regions.