Tag Archives: genome

Scientists publish the first complete human genome

WASHINGTON, March 31 (Reuters) – Scientists on Thursday published the first complete human genome, filling in gaps remaining after previous efforts while offering new promise in the search for clues regarding disease-causing mutations and genetic variation among the world’s 7.9 billion people.

Researchers in 2003 unveiled what was then billed as the complete sequence of the human genome. But about 8% of it had not been fully deciphered, mainly because it consisted of highly repetitive chunks of DNA that were difficult to mesh with the rest.

A consortium of scientists resolved that in research published in the journal Science. The work was initially made public last year before its formal peer review process.

“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint,” Eric Green, director of the National Human Genome Research Institute (NHGRI), part of the U.S. National Institutes of Health, said in a statement.

“This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease,” Green added.

The consortium’s full version is composed of 3.055 billion base pairs, the units from which chromosomes and our genes are built, and 19,969 genes that encode proteins. Of these genes, the researchers identified about 2,000 new ones. Most of those are disabled, but 115 may still be active. The scientists also spotted about 2 million additional genetic variants, 622 of which were present in medically relevant genes.

The consortium was dubbed Telomere-to-Telomere (T2T), named after the structures found at the ends of all chromosomes, the threadlike structures in the nucleus of most living cells that carry genetic information in the form of genes.

“In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to better guide their healthcare,” Adam Phillippy, one of the leaders of T2T and a senior investigator at NHGRI, said in a statement.

“Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can clearly see everything, we are one step closer to understanding what it all means,” Phillippy added.

Among other things, the new DNA sequences provided fresh detail about the region around what is called the centromere, where chromosomes are grabbed and pulled apart when cells divide to ensure that each “daughter” cell inherits the proper number of chromosomes.

“Uncovering the complete sequence of these formerly missing regions of the genome told us so much about how they’re organized, which was totally unknown for many chromosomes,” Nicolas Altemose, a postdoctoral fellow at the University of California, Berkeley, said in a statement.

Reporting by Will Dunham, Editing by Rosalba O’Brien

Scientists sequence the complete human genome for the first time

“Having this complete information will allow us to better understand how we form as an individual organism and how we vary not just between other humans but other species,” Evan Eichler, a Howard Hughes Medical Institute investigator at the University of Washington and the research leader, said Thursday.

The new research introduces 400 million letters to the previously sequenced DNA — an entire chromosome’s worth. The full genome will allow scientists to analyze how DNA differs between people and whether these genetic variations play a role in disease.

Until now, it was unclear what these unknown genes coded for.

“It turns out that these genes are incredibly important for adaptation,” Eichler said. “They contain immune response genes that help us to adapt and survive infections and plagues and viruses. They contain genes that are … very important in terms of predicting drug response.”

Eichler also said that some of the recently uncovered genes are even responsible for making human brains larger than those of other primates, providing insight into what makes humans unique.

This remaining 8% of the human genome had stumped scientists for years because of its complexities. For one thing, it contained DNA regions with several repetitions, which made it challenging to string the DNA together in the correct order using previous sequencing methods.

The researchers relied on two DNA sequencing technologies that emerged over the past decade to bring this project to fruition: the Oxford Nanopore DNA sequencing method, which can sequence up to 1 million DNA letters at once but with some mistakes, and the PacBio HiFi DNA sequencing method, which can read 20,000 letters with 99.9% accuracy.

Sequencing DNA is like solving a jigsaw puzzle, Eichler said. Scientists must first break the DNA into smaller parts and then use sequencing machines to piece it together in the correct order. Previous sequencing tools could sequence only small sections of DNA at once.

With a 10,000-piece puzzle, it’s hard to correctly arrange small pieces when they look alike, just as it is hard to place short sections of repetitive DNA. But with a 500-piece puzzle, it’s much easier to arrange the larger pieces or, in this case, longer segments of DNA.
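
To make the jigsaw analogy concrete, here is a minimal Python sketch. It is not the consortium's assembly software: the toy genome, the repeat unit and the `placements` helper are all invented for illustration. It shows that a short read falling entirely inside a repetitive stretch matches several positions, while a read long enough to reach the unique sequence on both sides fits in only one place.

```python
def placements(genome: str, read: str) -> list[int]:
    """Return every start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Search again one base further along so overlapping matches are found.
        start = genome.find(read, start + 1)
    return positions


# Hypothetical toy genome: a unique flank, one unit repeated four times,
# then another unique flank.
repeat_unit = "ACGTT"
genome = "GGCATC" + repeat_unit * 4 + "TTAGCA"

short_read = genome[8:14]  # 6 letters, entirely inside the repeats
long_read = genome[3:29]   # 26 letters, spanning the repeats and both flanks

print("short read fits at:", placements(genome, short_read))
print("long read fits at: ", placements(genome, long_read))
```

The short read has three equally good positions, so its true origin cannot be recovered from the read alone. The long read carries enough unique context to be placed unambiguously, which is the advantage the article attributes to long-read sequencing.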

A second challenge was finding cells that contained only one genome.

Standard human cells contain two sets of DNA, a maternal copy and a paternal copy, but this team used DNA from a group of cells called a complete hydatidiform mole, which contains a duplicate of the paternal set of DNA. A complete hydatidiform mole is a rare complication of a pregnancy caused by the abnormal growth of cells that originate from the placenta. This approach simplifies the genome so that scientists need sequence only one set rather than two sets of DNA.

Because the research team used a duplicate set of DNA, the scientists were unable to sequence the Y chromosome originally. According to lead study author Adam Phillippy, the team has managed to sequence the Y chromosome using a different set of cells.

A complete set of 24 sequenced chromosomes is available on the University of California, Santa Cruz genome browser.

Decoding this gapless sequence has a high price. Phillippy, who is also head of the gene informatics section at the National Human Genome Research Institute, said that altogether, the project cost a few million dollars or more. But that’s a fraction of the almost $450 million that it cost the Human Genome Project to achieve its final sequence in 2003. And with new technology, sequencing is only getting cheaper.

For now, it’s still too costly and time-consuming for everyone to sequence their own genome. But research is underway that uses this genome to identify whether certain genetic differences are linked with specific cancers. Knowing the genetic variations could also allow doctors to better tailor treatments, said Michael Schatz, another researcher on the team and a professor of computer science and biology at Johns Hopkins University.

Phillippy said he hopes that within the next 10 years, sequencing individuals’ genomes can become a routine medical test that costs less than $1,000. His team continues to work toward that goal.

Charles Rotimi, scientific director of the National Human Genome Research Institute, said in a statement that this scientific achievement is “moving us closer to individualized medicine for all humanity.” Rotimi was not involved in the research.

First complete gap-free human genome sequence published | Genetics

More than two decades after the draft human genome was celebrated as a scientific milestone, scientists have finally finished the job. The first complete, gap-free sequence of a human genome has been published in an advance expected to pave the way for new insights into health and what makes our species unique.

Dr Karen Miga, a scientist at the University of California, Santa Cruz who co-led the international consortium behind the project, said: “These parts of the human genome that we haven’t been able to study for 20-plus years are important to our understanding of how the genome works, genetic diseases, and human diversity and evolution.”

Until now, about 8% of the human genome was missing, including large stretches of highly repetitive sequences, sometimes described as “junk DNA”. In reality though, these repeated sections were omitted due to technical difficulties in sequencing them, rather than pure lack of interest.

Sequencing a genome is something like slicing up a book into snippets of text then trying to reconstruct the book by piecing them together again. Stretches of text that contain a lot of common or repeated words and phrases would be harder to put in their correct place than more unique pieces of text. New “long-read” sequencing techniques that decode big chunks of DNA at once – enough to capture many repeats – helped overcome this hurdle.

Scientists were able to simplify the puzzle further by using an unusual cell type that only contains DNA inherited from the father (most cells in the body contain two genomes – one from each parent). Together these two advances allowed them to decode the more than 3bn letters that comprise the human genome.

“In the future, when someone has their genome sequenced, we will be able to identify all of the variants in their DNA and use that information to better guide their healthcare,” said Dr Adam Phillippy, of the National Human Genome Research Institute in Maryland and co-chair of the consortium. “Truly finishing the human genome sequence was like putting on a new pair of glasses. Now that we can clearly see everything, we are one step closer to understanding what it all means.”

One area of interest is that the parts of the genome with many repeated stretches include those where most of human genetic variation is found. Variability within these regions may also provide crucial clues to how our human ancestors underwent rapid evolutionary changes that led to more complex cognition.

The work is also likely to lead to a better understanding of enigmatic components of the genome known as centromeres. They are dense bundles of DNA that hold chromosomes together and play a role in cell division, but until now had been considered unmappable because they contain thousands of stretches of DNA sequences that repeat over and over.

The science behind the sequencing effort and some initial analysis of the new genome regions are outlined in six papers published in the journal Science.

“Opening up these new parts of the genome, we think there will be genetic variation contributing to many different traits and disease risk,” said Rajiv McCoy, of Johns Hopkins University and a participant in the Telomere to Telomere (T2T) consortium. “There’s an aspect of this that’s like, we don’t know yet what we don’t know.”

Scientists say they can read nearly the whole genome of an IVF-created embryo | Science

A California company says it can decipher almost all the DNA code of a days-old embryo created through in vitro fertilization (IVF)—a challenging feat because of the tiny volume of genetic material available for analysis. The advance depends on fully sequencing both parents’ DNA and “reconstructing” an embryo’s genome with the help of those data. And the company suggests it could make it possible to forecast risk for common diseases that develop decades down the line. Currently, such genetic risk prediction is being tested in adults, and sometimes offered clinically. The idea of applying it to IVF embryos has generated intense scientific and ethical controversy. But that hasn’t stopped the technology from galloping ahead.

Heart conditions, autoimmune diseases, cancer, and many other adult ailments have complex and often mysterious origins, fueled by a mix of genetic and environmental influences. Hundreds of variations in the human genome can collectively raise or lower risk of a particular disease, sometimes by a lot. Predicting a person’s chance of a specific illness by blending this genetic variability into what’s called a “polygenic risk score” remains under study in adults, in part because our understanding of how gene variants come together to drive or protect against disease remains a work in progress. In embryos it’s even harder to prove a risk score’s accuracy, researchers say. “Ultimately, how are we going to validate this in embryos?” says Norbert Gleicher, an infertility specialist at the Center for Human Reproduction in New York City who was not involved in the research. “We’ll have to wait for 40 or 50 years” to find out whether a person develops the diseases they were screened for as an embryo.
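
As a rough illustration of the polygenic risk score described above, such a score is commonly computed as a weighted sum: for each variant, the number of copies a person carries (0, 1 or 2) is multiplied by an effect size estimated from large studies, and the products are added together. The Python sketch below assumes that simple additive form. The variant names and weights are invented, and it is not the model used by MyOme or by any clinical test.

```python
def polygenic_risk_score(genotype: dict[str, int], weights: dict[str, float]) -> float:
    """Additive score: sum of (effect size x number of copies carried).

    `genotype` maps a variant name to 0, 1 or 2 copies; variants missing
    from it are treated as 0 copies.
    """
    return sum(weight * genotype.get(variant, 0) for variant, weight in weights.items())


# Hypothetical effect sizes: positive values raise risk, negative values lower it.
weights = {"variant_a": 0.25, "variant_b": -0.25, "variant_c": 0.125}

# Hypothetical person: two copies of variant_a, one of variant_b, none of variant_c.
genotype = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(polygenic_risk_score(genotype, weights))
```

A raw score like this means little on its own. It is usually interpreted by comparing it with the distribution of scores in a reference population, which is one reason accuracy can drop for people whose ancestry differs from that of the studies the weights came from.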

With current technologies, it’s very difficult to accurately sequence a whole genome from just a few cells, though some have tried with different methods. The new work on polygenic risk scores for IVF embryos is “exploratory research,” says Premal Shah, CEO of MyOme, the company reporting the results. Today in Nature Medicine, the MyOme team, led by company co-founders and scientists Matthew Rabinowitz and Akash Kumar, along with colleagues elsewhere, describe creating such scores by first sequencing the genomes of 10 pairs of parents who had already undergone IVF and had babies. The researchers then used data collected during the IVF process: The couples’ embryos, 110 in all, had undergone limited genetic testing at that time, a sort of spot sequencing of cells, called microarray measurements. Such analysis can test for an abnormal number of chromosomes, certain genetic diseases, and rearrangements of large chunks of DNA, and it has become an increasingly common part of IVF treatment in the United States. By combining these patchy embryo data with the more complete parental genome sequences, and applying statistical and population genomics techniques, the researchers could account for the gene shuffling that occurs during reproduction and calculate which chromosomes each parent had passed down to each embryo. In this way, they could predict much of that embryo’s DNA.

The researchers had a handy way to see whether their reconstruction was accurate: Check the couples’ babies. They collected cheek swab samples from the babies and sequenced their full genome, just as they’d done with the parents. They then compared that “true sequence” with the reconstructed genome for the embryo from which the child originated. The comparison revealed, essentially, a match: For a 3-day-old embryo, at least 96% of the reconstructed genome aligned with the inherited gene variants in the corresponding baby; for a 5-day-old embryo, it was at least 98%. (Because much of the human genome is the same across all people, the researchers focused on the DNA variability that made the parents, and their babies, unique.)

“What they presented is a nice method to sequence the genomes of all embryos,” says Shai Carmi, a statistical geneticist at the Hebrew University of Jerusalem. Such an accomplishment “is not trivial.” Kumar hopes being able to reconstruct most of an embryo’s genome will provide information well beyond what’s now available to people undergoing IVF, to determine an offspring’s chances of staying healthy. “It’s not enough to focus on the single gene anymore,” he says.

Once they had reconstructed embryo genomes in hand, the researchers turned to published data from large genomic studies of adults with or without common chronic diseases and the polygenic risk score models that were derived from that information. Then, MyOme applied those models to the embryos, crunching polygenic risk scores for 12 diseases, including breast cancer, coronary artery disease, and type 2 diabetes. The team also experimented with combining the reconstructed embryo sequence of single genes, such as BRCA1 and BRCA2, that are known to dramatically raise risk of certain diseases, with an embryo’s polygenic risk scores for that condition—in this case, breast cancer.

“We’re talking about providing information on risks that people care about—heart disease, cancer, autoimmune disease,” says Kumar, who is also a pediatric medical geneticist. He still sees patients and sometimes encounters frustration from parents wanting to avoid conferring a high risk of ailments that run in their families to their offspring. At the same time, Kumar stresses, “This is a new technology. It’s going to have controversies and challenges.”

In fact, many researchers say it’s premature to use polygenic risk scores to select which embryos are transferred. Such risk scores are “primarily still a research tool, even in adults,” says Barbara Koenig, a medical anthropologist who works on bioethics at the University of California, San Francisco. She’s involved in a large study called Women Informed to Screen Depending On Measures of risk that offers some women polygenic risk scores for breast cancer along with screening recommendations. “The scores are constantly being refined, every week they change,” Koenig says. “It’s like a constantly moving target.”

Kumar and his co-authors acknowledge the scores’ limitations, including that they are based on DNA from populations of overwhelmingly European ancestry and may be less accurate in other groups. Because of that, the MyOme team did not create disease risk assessments for embryos whose genome reflected at least 20% Asian or African ancestry. Even the DNA array technologies used to reconstruct the embryonic genomes have a European bias, says Genevieve Wojcik, a genetic epidemiologist at Johns Hopkins University, and may be less reliable for those with non-European ancestry. “You have a tool that cannot be used for a large proportion of the population,” she says. Kumar says the company is working to make the technology more broadly applicable.

There are other concerns, too. Although Carmi says the accuracy of polygenic risk scores in adults has improved, it’s unknown whether scores based on adult DNA and health data translate to embryos, in part because the environment can play a major role in shaping outcomes. “It’s difficult to say whether this will be meaningful,” Carmi says. He and his colleagues have seen this limitation up close: They’ve used computer modeling to assess whether height and IQ can be boosted by selecting embryos using polygenic risk scores for either trait, and found that generally, it doesn’t work. “We’re still missing a lot” when it comes to understanding genetics, even for highly heritable traits such as height, he says. In another computer modeling paper, however, Carmi found certain disease polygenic risk scores in embryos may prove useful. That’s because unlike height, which runs across a spectrum, heart attacks, say, either happen or they don’t. And pulling down genetic risk somewhat by implanting a different embryo, he says, may be enough to avoid that outcome.

But like a painting with only one corner visible, much of the human genome remains shrouded, including how genes interact with each other and the multiple effects one gene may have. Gleicher worries about the unintended consequences of applying polygenic risk scores to embryos. “You can achieve omission of one disease but at the same time, by doing that, induce another disease.” For example, modeling suggests selecting an embryo with a high polygenic risk score for educational attainment could also increase its risk for bipolar disorder. In December 2021, the European Society of Human Genetics urged against using polygenic risk scores for embryo selection—a position firmly endorsed by Gleicher, who calls such practice “unethical.”

Still, some companies and fertility clinics already claim they can help parents select embryos for IQ and risk of various diseases. MyOme, meanwhile, is applying the methods from this latest study to another that’s ongoing, working with IVF clinics and couples who want to learn polygenic risk scores for their frozen embryos. Couples may opt to decide which embryos to implant based on that information. “When you have a lot of information presented in this context, is it going to provide empowerment, or is it just going to confuse the parents?” Kumar asks. That’s one question he hopes this ongoing study can answer.

Kumar says he’s well aware of the criticisms, including that polygenic risk scores may not even be accurate for embryos. “That point is heard,” Kumar says. “Our focus is doing this research because we see promise.”

What You Eat Has The Power to Reprogram Your Genes. An Expert Explains How

People typically think of food as calories, energy, and sustenance. However, the latest evidence suggests that food also “talks” to our genome, which is the genetic blueprint that directs the way the body functions down to the cellular level.

This communication between food and genes may affect your health, physiology, and longevity. The idea that food delivers important messages to an animal’s genome is the focus of a field known as nutrigenomics.

This is a discipline still in its infancy, and many questions remain cloaked in mystery. Yet already, we researchers have learned a great deal about how food components affect the genome.

I am a molecular biologist who researches the interactions among food, genes, and brains in the effort to better understand how food messages affect our biology. The efforts of scientists to decipher this transmission of information could one day result in healthier and happier lives for all of us.

But until then, nutrigenomics has unmasked at least one important fact: Our relationship with food is far more intimate than we ever imagined.

The interaction of food and genes

If the idea that food can drive biological processes by interacting with the genome sounds astonishing, one need look no further than a beehive to find a proven and perfect example of how this happens. Worker bees labor nonstop, are sterile, and live only a few weeks.

The queen bee, sitting deep inside the hive, has a life span that lasts for years and a fecundity so potent she gives birth to an entire colony.

And yet, worker and queen bees are genetically identical organisms. They become two different life forms because of the food they eat. The queen bee feasts on royal jelly; worker bees feed on nectar and pollen.

Both foods provide energy, but royal jelly has an extra feature: its nutrients can unlock the genetic instructions to create the anatomy and physiology of a queen bee.

So how is food translated into biological instructions? Remember that food is composed of macronutrients. These include carbohydrates (or sugars), proteins, and fats.

Food also contains micronutrients such as vitamins and minerals. These compounds and their breakdown products can trigger genetic switches that reside in the genome.

Like the switches that control the intensity of the light in your house, genetic switches determine how much of a certain gene product is produced. Royal jelly, for instance, contains compounds that activate genetic controllers to form the queen’s organs and sustain her reproductive ability.

In humans and mice, byproducts of the amino acid methionine, which are abundant in meat and fish, are known to influence genetic dials that are important for cell growth and division.

And vitamin C plays a role in keeping us healthy by protecting the genome from oxidative damage; it also promotes the function of cellular pathways that can repair the genome if it does get damaged.

Depending on the type of nutritional information, the genetic controls activated and the cell that receives them, the messages in food can influence wellness, disease risk, and even life span. But it’s important to note that to date, most of these studies have been conducted in animal models, like bees.

Interestingly, the ability of nutrients to alter the flow of genetic information can span across generations. Studies show that in humans and animals, the diet of grandparents influences the activity of genetic switches and the disease risk and mortality of grandchildren.

Cause and effect

One interesting aspect of thinking of food as a type of biological information is that it gives new meaning to the idea of a food chain. Indeed, if our bodies are influenced by what we have eaten – down to a molecular level – then what the food we consume “ate” also could affect our genome.

For example, compared to milk from grass-fed cows, the milk from grain-fed cattle has different amounts and types of fatty acids and vitamins C and A. So when humans drink these different types of milk, their cells also receive different nutritional messages.

Similarly, a human mother’s diet changes the levels of fatty acids as well as vitamins such as B-6, B-12 and folate that are found in her breast milk. This could alter the type of nutritional messages reaching the baby’s own genetic switches, although whether or not this has an effect on the child’s development is, at the moment, unknown.

And, maybe unbeknownst to us, we too are part of this food chain. The food we eat doesn’t tinker with just the genetic switches in our cells, but also with those of the microorganisms living in our guts, skin, and mucosa.

One striking example: In mice, the breakdown of short-chain fatty acids by gut bacteria alters the levels of serotonin, a brain chemical messenger that regulates mood, anxiety, and depression, among other processes.

Food additives and packaging

Added ingredients in food can also alter the flow of genetic information inside cells. Breads and cereals are enriched with folate to prevent birth defects caused by deficiencies of this nutrient.

But some scientists hypothesize that high levels of folate in the absence of other naturally occurring micronutrients such as vitamin B-12 could contribute to the higher incidence of colon cancer in Western countries, possibly by affecting the genetic pathways that control growth.

This could also be true with chemicals found in food packaging. Bisphenol A, or BPA, a compound found in plastic, turns on genetic dials in mammals that are critical to development, growth, and fertility.

For example, some researchers suspect that, in both humans and animal models, BPA influences the age of sexual differentiation and decreases fertility by making genetic switches more likely to turn on.

All of these examples point to the possibility that the genetic information in food could arise not just from its molecular composition – the amino acids, vitamins and the like – but also from the agricultural, environmental, and economic policies of a country, or the lack of them.

Scientists have only recently begun decoding these genetic food messages and their role in health and disease. We researchers still don’t know precisely how nutrients act on genetic switches, what their rules of communication are and how the diets of past generations influence their progeny.

Many of these studies have so far been done only in animal models, and much remains to be worked out about what the interactions between food and genes mean for humans.

What is clear, though, is that unraveling the mysteries of nutrigenomics is likely to empower both present and future societies and generations.

Monica Dus, Assistant Professor of Molecular, Cellular, and Developmental Biology, University of Michigan.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

We Now Have The Largest Ever Human ‘Family Tree’, With 231 Million Ancestral Lineages

In June 2000, two rival groups of researchers shook hands in the shared success of a milestone in biology – the delivery of a rough draft of the human genome.

What started with an incomplete map of our chromosomes has since bloomed into a vast trove of individualized sequences from all corners of the globe, and in many cases stretching far back in time.

Somewhere in that ocean of decoded DNA is a story of our shared humanity.

Unfortunately, reading it is easier said than done. The sheer mass of data is one problem. Subtle differences in samples, diverse formats, and analysis techniques that prioritize different kinds of errors also present obstacles to a unified interpretation.

Now researchers from the Big Data Institute (BDI) at the University of Oxford in the UK have made a significant start, by merging a forest of more than 3,600 individual sequences from 215 populations into a single, enormous tree.

The tree’s branches comprise a mind-blowing 231 million ancestral lineages. At its base is a spread of roots represented by eight ancient, highly detailed human genome sequences, with thousands of smaller snippets used to confirm their place deep in our past.

Among them are three Neanderthal genomes, one genome from a Denisovan, and genomes from a small family who lived in Siberia more than four thousand years ago.

“Essentially, we are reconstructing the genomes of our ancestors and using them to form a series of linked evolutionary trees that we call a ‘tree sequence’,” says geneticist Anthony Wilder Wohns, who led the study while completing his doctorate at the BDI.

“We can then estimate when and where these ancestors lived.”

Their tree sequence method makes use of what’s known as a succinct data structure – a computing concept that aims to represent data in an optimal amount of space that also limits the amount of time needed to probe it all with questions.

We might apply similar thinking when saving files on our own computer, finding a compromise between compressing documents and squeezing them into long lists of folders, or simply saving everything on the desktop.

In this specific case, a tree sequence finds correlations between different branches of a tree to help make the large pools of information easier to study. 

By turning the data into graphs with nodes representing various lineages and mapping mutations along the edges, massive genetic databases can not only be squeezed into a relatively small space, but can be accessed more easily by algorithms designed to search for interesting statistics.
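
The idea of nodes, edges and mutations can be sketched in a few lines of Python. This is a toy model written for illustration, not the researchers' software or data: the `Edge` and `Mutation` classes, the node numbers and the genome coordinates are all assumptions. Each edge records which stretch of the genome it applies to, so a relationship shared by neighbouring trees is stored once rather than repeated in every tree, and a sample's genetic variants are recovered by walking up from the sample towards the root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    left: int    # first genome position the edge covers (inclusive)
    right: int   # end of the covered stretch (exclusive)
    parent: int
    child: int


@dataclass(frozen=True)
class Mutation:
    position: int
    node: int    # the mutation sits on the branch just above this node


# Hypothetical data: samples 0, 1, 2 and ancestors 3, 4, 5 on a 10-letter genome.
# Ancestor 3 is the parent of samples 0 and 1 everywhere, so those two edges are
# stored once. Only the top of the tree changes at position 5.
EDGES = [
    Edge(0, 10, parent=3, child=0),
    Edge(0, 10, parent=3, child=1),
    Edge(0, 5, parent=4, child=2),
    Edge(0, 5, parent=4, child=3),
    Edge(5, 10, parent=5, child=2),
    Edge(5, 10, parent=5, child=3),
]


def parent_at(edges, position, child):
    """Return the parent of `child` in the tree covering `position`, or None at the root."""
    for edge in edges:
        if edge.child == child and edge.left <= position < edge.right:
            return edge.parent
    return None


def carries(edges, sample, mutation):
    """A sample carries a mutation if the mutation's node lies on its path to the root."""
    node = sample
    while node is not None:
        if node == mutation.node:
            return True
        node = parent_at(edges, mutation.position, node)
    return False


def genotypes(edges, mutation, samples):
    """Return 1 for each sample that inherited the mutation and 0 otherwise."""
    return [int(carries(edges, sample, mutation)) for sample in samples]


print(genotypes(EDGES, Mutation(position=2, node=3), [0, 1, 2]))
print(genotypes(EDGES, Mutation(position=7, node=2), [0, 1, 2]))
```

In this toy example a mutation above ancestor 3 is inherited by samples 0 and 1 without being written down for each of them. That is the sense in which storing the genealogy can take less space than storing every genome separately.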

“The power of our approach is that it makes very few assumptions about the underlying data and can also include both modern and ancient DNA samples,” says Wohns.

Incorporating labels on the geographical locations of sequences allowed the team to estimate where certain common ancestors might have once lived and how they moved about.

Not only does this reveal events we already suspect, such as how human populations migrated from Africa, it hints at changes in population densities within ancestral groups we’re still learning about, such as the Denisovans.

Thanks to the efficiency of this process, the already impressive tree has plenty of room to grow as more genetic data become available in the future.

Adding millions more genomes will only make any further results more accurate, pinpointing exactly where a novel sequence fits in a genealogy that stretches around the world.

“This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome,” says BDI evolutionary geneticist, Yan Wong.

Thinking even bigger, there’s no reason the same approach couldn’t be applied to other species, possibly one day contributing to a global tapestry of life on Earth.

“While humans are the focus of this study, the method is valid for most living things; from orangutans to bacteria,” says Wohns.

“It could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history.”

This research was published in Science.

 

Read original article here

‘Human Family Tree’ Includes 27 Million Ancestors

A visualization showing the inferred human ancestral lineages over time and geographical location. Each line represents an ancestral relationship; the line’s width shows the frequency of the relationship. Color indicates the estimated age of the ancestor.
Image: Reproduced, with permission, from Wohns et al., A unified genealogy of modern and ancient genomes. Science (2022). doi: 10.1126/science.abi8264.

A team of scientists has combined modern and ancient genomes to build a new “genealogy of everyone,” in an achievement that sets the groundwork for future studies into our evolution and global spread.

Thousands upon thousands of modern and ancient human genomes have been integrated into a coherent and unified genealogy, according to new research published in Science. It’s akin to a family tree, but it’s a whopper, as it contains nearly 27 million ancestors, making it the largest human genealogy ever created. The new map could be used to study human evolution and even assist with medical research having to do with hereditary diseases.

“We have basically built a huge family tree, a genealogy for all of humanity that models as exactly as we can the history that generated all the genetic variation we find in humans today,” Yan Wong, an evolutionary geneticist at the Big Data Institute and a co-author of the study, explained in a University of Oxford statement. “This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome.”

The network shows how individuals around the world are related to each other, and it predicts common ancestors, including when they lived and where they came from. It also models key events in human history, such as human migrations out of Africa and dispersals to other parts of the globe.

Researchers have been collecting human genomes for years, but the challenge has been in making sense of it all from a larger, holistic perspective. Comparisons of these genomes have been difficult owing to disparate methods of gathering the data, the presence of multiple databases, and variances in terms of data quality and analysis. To compound the problem, each human genome contains segments from multiple ancestries, whether from various ethnic groups or different human populations altogether, such as Neanderthals and Denisovans. These ancestries also exist across vast timescales, which represents yet another challenge. What’s needed are algorithms that can accommodate these challenges, and that’s exactly what the researchers are claiming to have achieved.

To create the map, Wong, with his colleagues, applied a “non-parametric tree-recording method” to modern and ancient human genomes, the oldest of which date back hundreds of thousands of years. I reached out to Sharon Browning, a biostatistician at the University of Washington who wasn’t involved in the research, to get her take on the achievement.

“This paper is primarily about a great new tool for genetic studies called tskit, which is short for ‘tree sequence kit’,” explained Browning in an email. It’s called a tree because, “if you consider one small part of the genome in a number of individuals, and trace back the descent, eventually you get back to a single ancestor, like ‘mitochondrial Eve’ for the mitochondrial genome,” she said. “That single ancestor is the root of the tree, and the set of individuals that you were considering are the tips of the branches of the tree.” Browning said the tree looks different along different parts of the genome because of recombination (when the exchanging of genetic material results in variation), and that tskit is “used to infer the trees along the sequenced genome.”
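Browning's description can be sketched in a few lines of Python. This is a toy illustration only, not tskit's actual interface; the edge list, node numbers and helper names (`breakpoints`, `local_tree`, `ancestors`, `mrca`) are assumptions invented for the example. Each edge says "this parent is the ancestor of this child over this stretch of genome", and recombination shows up as a point where the set of edges, and so the tree, changes.

```python
# Edges are (left, right, parent, child); intervals are half-open [left, right).
# Hypothetical data: samples 0, 1, 2 are the tips; nodes 3, 4, 5 are ancestors.
EDGES = [
    (0, 6, 3, 0), (0, 6, 3, 1), (0, 6, 5, 3), (0, 6, 5, 2),      # left tree
    (6, 10, 4, 1), (6, 10, 4, 2), (6, 10, 5, 4), (6, 10, 5, 0),  # right tree
]


def breakpoints(edges):
    """Genome positions where the local tree may change."""
    return sorted({x for left, right, _, _ in edges for x in (left, right)})


def local_tree(edges, position):
    """Return {child: parent} for the tree that applies at `position`."""
    return {child: parent
            for left, right, parent, child in edges
            if left <= position < right}


def ancestors(tree, node):
    """Path from `node` up to the root of the local tree, inclusive."""
    path = [node]
    while path[-1] in tree:
        path.append(tree[path[-1]])
    return path


def mrca(tree, a, b):
    """Most recent common ancestor of nodes a and b in one local tree."""
    seen = set(ancestors(tree, a))
    for node in ancestors(tree, b):
        if node in seen:
            return node
    return None


left = local_tree(EDGES, 3)   # tree for the first stretch of genome
right = local_tree(EDGES, 8)  # tree after the recombination breakpoint
print(breakpoints(EDGES))
print(mrca(left, 0, 1), mrca(right, 0, 1))
```

In the toy data, samples 0 and 1 share the recent ancestor 3 on the left of the breakpoint, but on the right their closest common ancestor is the root, node 5. That is the sense in which the tree "looks different along different parts of the genome".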

A reconstruction of the face of a Neanderthal at the National Museum of Antiquities in Leiden, Netherlands.
Photo: Bart Maat / ANP / AFP (Getty Images)

Indeed, the algorithms work by looking at genetic variation to predict where common ancestors must be present in the evolutionary family tree. And because the genomes are geotagged, they can also predict where these common ancestors lived.

“Essentially, we are reconstructing the genomes of our ancestors and using them to form a vast network of relationships,” Anthony Wilder Wohns, the lead author of the study and a researcher at the Big Data Institute, said in the Oxford release. “We can then estimate when and where these ancestors lived. The power of our approach is that it makes very few assumptions about the underlying data and can also include both modern and ancient DNA samples.”

Browning said an earlier version of tskit showed promise, but it turned out to have significant limitations. The researchers have now addressed the limitations, “providing a tool that should be extremely useful across many different types of study,” she said. To which she added: “Although the authors provide a couple of applications, including their cool visualization of where human ancestors came from, the scope of possible applications is very large, and I would expect to see a flurry of activity from researchers developing these.”

Browning cautioned that the trees estimated by tskit “don’t come with uncertainty measures,” so she expects the results will be useful for positing new hypotheses, rather than for proving hypotheses. “Other more specialized methods will still be needed for verification purposes,” she said.

Looking ahead, the team hopes to add new genetic information to the system as it arrives. They don’t expect this to be a problem, as the system can accommodate millions more genomes.


Huge Project Is Now Underway to Sequence The Genome of Every Complex Species on Earth

The Earth Biogenome Project, a global consortium that aims to sequence the genomes of all complex life on Earth (some 1.8 million described species) in ten years, is ramping up.

 

The project’s origins, aims and progress are detailed in two multi-authored papers published today. Once complete, it will forever change the way biological research is done.

Specifically, researchers will no longer be limited to a few “model species” and will be able to mine the DNA sequence database of any organism that shows interesting characteristics. This new information will help us understand how complex life evolved, how it functions, and how biodiversity can be protected.

The project was first proposed in 2016, and I was privileged to speak at its launch in London in 2018. It is currently in the process of moving from its startup phase to full-scale production.

The aim of phase one is to sequence one genome from every taxonomic family on Earth, some 9,400 of them. By the end of 2022, one-third of these species should be done. Phase two will see the sequencing of a representative from all 180,000 genera, and phase three will mark the completion of all the species.

The importance of weird species

The grand aim of the Earth Biogenome Project is to sequence the genomes of all 1.8 million described species of complex life on Earth. This includes all plants, animals, fungi, and single-celled organisms with true nuclei (that is, all “eukaryotes”).

While model organisms like mice, rock cress, fruit flies and nematodes have been tremendously important in our understanding of gene functions, it’s a huge advantage to be able to study other species that may work a bit differently.

 

Many important biological principles came from studying obscure organisms. For instance, genes were famously discovered by Gregor Mendel in peas, and the rules that govern them were discovered in red bread mold.

DNA was discovered first in salmon sperm, and our knowledge of some systems that keep it secure came from research on tardigrades. Chromosomes were first seen in mealworms and sex chromosomes in a beetle (sex chromosome action and evolution have also been explored in fish and platypus). And telomeres, which cap the ends of chromosomes, were discovered in pond scum.

Answering biological questions and protecting biodiversity

Comparing closely and distantly related species provides tremendous power to discover what genes do and how they are regulated. For instance, in another PNAS paper, coincidentally also published today, my University of Canberra colleagues and I discovered Australian dragon lizards regulate sex by the chromosome neighborhood of a sex gene, rather than the DNA sequence itself.

Scientists also use species comparisons to trace genes and regulatory systems back to their evolutionary origins, which can reveal astonishing conservation of gene function across nearly a billion years. For instance, the same genes are involved in retinal development in humans and in fruit fly photoreceptors. And the BRCA1 gene that is mutated in breast cancer is responsible for repairing DNA breaks in plants and animals.

 

The genome of animals is also far more conserved than has been supposed. For instance, several colleagues and I recently demonstrated that animal chromosomes are 684 million years old.

It will be exciting, too, to explore the “dark matter” of the genome, and reveal how DNA sequences that don’t encode proteins can still play a role in genome function and evolution.

Another important aim of the Earth Biogenome Project is conservation genomics. This field uses DNA sequencing to identify threatened species, which make up about 28 percent of the world’s complex organisms, helping us monitor their genetic health and advise on management.

No longer an impossible task

Until recently, sequencing large genomes took years and many millions of dollars. But there have been tremendous technical advances that now make it possible to sequence and assemble large genomes for a few thousand dollars. The entire Earth Biogenome Project will cost less in today’s dollars than the Human Genome Project, which cost about US$3 billion in total.
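The comparison can be checked with back-of-the-envelope arithmetic. The sketch below uses only the two figures quoted in this article (about US$3 billion and 1.8 million species); treating the US$3 billion as a fixed ceiling and ignoring inflation and project overheads are simplifying assumptions, and the function name is made up for illustration.

```python
HUMAN_GENOME_PROJECT_USD = 3_000_000_000  # figure quoted in the article
DESCRIBED_SPECIES = 1_800_000             # figure quoted in the article


def break_even_cost_per_genome(budget_usd, n_genomes):
    """Average per-genome cost at which the total exactly equals the budget."""
    return budget_usd / n_genomes


break_even = break_even_cost_per_genome(HUMAN_GENOME_PROJECT_USD, DESCRIBED_SPECIES)
print(f"Break-even average cost per genome: ${break_even:,.2f}")
```

Under these assumptions the average cost per species has to stay below roughly US$1,700 for the whole project to come in under the Human Genome Project figure.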

In the past, researchers would have to identify the order of the four bases chemically on millions of tiny DNA fragments, then paste the entire sequence together again. Today they can register different bases based on their physical properties, or by binding each of the four bases to a different dye. New sequencing methods can scan long molecules of DNA that are tethered in tiny tubes, or squeezed through tiny holes in a membrane.

 

Why sequence everything?

But why not save time and money by sequencing just key representative species?

Well, the whole point of the Earth Biogenome Project is to exploit the variation between species to make comparisons, and also to capture remarkable innovations in outliers.

There is also the fear of missing out. For instance, if we sequence only 69,999 of the 70,000 species of nematode, we might miss the one that could divulge the secrets of how nematodes can cause diseases in animals and plants.

There are currently 44 affiliated institutions in 22 countries working on the Earth Biogenome Project. There are also 49 affiliated projects, including enormous projects such as the California Conservation Genomics Project, the Bird 10,000 Genomes Project and UK’s Darwin Tree of Life Project, as well as many projects on particular groups such as bats and butterflies.

Jenny Graves, Distinguished Professor of Genetics and Vice Chancellor’s Fellow, La Trobe University.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

 


Artificial Intelligence Has Found an Unknown ‘Ghost’ Ancestor in The Human Genome

Nobody knows who she was, just that she was different: a teenage girl from over 50,000 years ago of such strange uniqueness she looked to be a ‘hybrid’ ancestor to modern humans that scientists had never seen before.

 

Only recently, researchers have uncovered evidence she wasn’t alone. In a 2019 study analysing the complex mess of humanity’s prehistory, scientists used artificial intelligence (AI) to identify an unknown human ancestor species that modern humans encountered – and shared dalliances with – on the long trek out of Africa millennia ago.

“About 80,000 years ago, the so-called Out of Africa occurred, when part of the human population, which already consisted of modern humans, abandoned the African continent and migrated to other continents, giving rise to all the current populations”, explained evolutionary biologist Jaume Bertranpetit from the Universitat Pompeu Fabra in Spain.

As modern humans forged this path into the landmass of Eurasia, they forged some other things too – breeding with ancient and extinct hominids from other species.

Up until recently, these occasional sexual partners were thought to include Neanderthals and Denisovans, the latter of which were unknown until 2010.

But in this study, a third ex from long ago was isolated in Eurasian DNA, thanks to deep learning algorithms sifting through a complex mass of ancient and modern human genetic code.

 

Using a statistical technique called Bayesian inference, the researchers found evidence of what they call a “third introgression” – a ‘ghost’ archaic population that modern humans interbred with during the African exodus.

“This population is either related to the Neanderthal-Denisova clade or diverged early from the Denisova lineage,” the researchers wrote in their paper, meaning this third population in humanity’s sexual history may itself have been a mix of Neanderthals and Denisovans.

In a sense, from the vantage point of deep learning, it’s a hypothetical corroboration of sorts of the teenage girl ‘hybrid fossil’ identified in 2018; although there’s still more work to be done, and the research projects themselves aren’t directly linked.

“Our theory coincides with the hybrid specimen discovered recently in Denisova, although as yet we cannot rule out other possibilities”, one of the team, genomicist Mayukh Mondal from the University of Tartu in Estonia, said in a press statement at the time of discovery.

That being said, the discoveries being made in this area of science are coming thick and fast.

Also in 2018, another team of researchers identified evidence of what they called a “definite third interbreeding event” alongside Denisovans and Neanderthals, and a pair of papers published in early 2019 traced the timeline of how those extinct species intersected and interbred in clearer detail than ever before.

 

There’s a lot more research to be done here yet. Applying this kind of AI analysis is a decidedly new technique in the field of human ancestry, and the known fossil evidence we’re dealing with is amazingly scant.

But according to the research, what the team has found explains not only a long-forgotten process of introgression – it’s a dalliance that, in its own way, informs part of who we are today.

“We thought we’d try to find these places of high divergence in the genome, see which are Neanderthal and which are Denisovan, and then see whether these explain the whole picture,” Bertranpetit told Smithsonian.

“As it happens, if you subtract the Neanderthal and Denisovan parts, there is still something in the genome that is highly divergent.”

The findings were published in Nature Communications.

A version of this article was originally published in February 2019.

 


The Closest Related Virus to SARS-CoV-2 Has Just Been Discovered, And It’s in Bats

Researchers have discovered coronaviruses lurking in Laotian bats that appear to be the closest known relatives to SARS-CoV-2, the virus that causes COVID-19, found to date, according to news reports.

 

In a new study, researchers from the Pasteur Institute in France and the University of Laos captured 645 bats from limestone caves in northern Laos and screened them for viruses related to SARS-CoV-2. They found three viruses – which they dubbed BANAL-52, BANAL-103 and BANAL-236 – that infected horseshoe bats and shared more than 95 percent of their overall genome with SARS-CoV-2.

One of the viruses, BANAL-52, was 96.8 percent identical to SARS-CoV-2, according to Nature News. That makes BANAL-52 more genetically similar to SARS-CoV-2 than any other known virus.

Previously, the closest known relative to SARS-CoV-2 was RaTG13, which was found in horseshoe bats in 2013 and shares 96.1 percent of its genome with SARS-CoV-2, Nature News reported.
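For a sense of what these percentages mean, here is a minimal sketch of how identity between two already-aligned sequences can be computed. It is not the method used in the study; the function names and short example sequences are invented, and the genome length of 29,903 bases (the commonly cited length of the SARS-CoV-2 reference genome) is an assumption brought in from outside this article.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions in two aligned sequences, skipping gaps ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def approx_differences(identity_percent, genome_length):
    """Rough number of differing bases implied by a percent identity."""
    return round((100.0 - identity_percent) / 100.0 * genome_length)


# Made-up toy alignment: one mismatch in ten positions.
toy_identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")

ASSUMED_GENOME_LENGTH = 29_903  # assumption, not a figure from the article
banal52_diff = approx_differences(96.8, ASSUMED_GENOME_LENGTH)
ratg13_diff = approx_differences(96.1, ASSUMED_GENOME_LENGTH)
print(toy_identity, banal52_diff, ratg13_diff)
```

On that assumed genome length, 96.8 percent identity leaves on the order of a thousand differing bases and 96.1 percent leaves a couple of hundred more, so even the closest known relative is still separated from SARS-CoV-2 by many mutations.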


What’s more, all three of the newly discovered viruses are more similar to SARS-CoV-2 in a key part of their genome – called the receptor binding domain (RBD) – than other known viruses.

The RBD is the part of the virus that allows it to bind to host cells. With SARS-CoV-2, the RBD binds to a receptor known as ACE2 on human cells, and the virus uses this receptor as a gateway into cells.

 

Critically, the new study found that BANAL-52, BANAL-103 and BANAL-236 can bind to ACE2 and use it to enter human cells. So far, other candidates proposed as ancestors of SARS-CoV-2 found in bats, including RaTG13, haven’t been able to do this, the researchers said.

The three viruses could bind to ACE2 about as well as early strains of SARS-CoV-2 found in Wuhan, they said.

The findings, which were posted to the preprint server Research Square on September 17, add to the evidence that SARS-CoV-2 had a natural origin, rather than escaping from a lab.

The results show “that sequences very close to those of the early strains of SARS-CoV-2… exist in nature,” the researchers wrote in their paper, which has yet to be peer-reviewed.

“The receptor binding domain of SARS-CoV-2 looked unusual when it was first discovered because there were so few viruses to compare it to,” Edward Holmes, an evolutionary biologist at the University of Sydney, who wasn’t involved in the research, told Bloomberg.

“Now that we are sampling more from nature, we are starting to find these closely related bits of gene sequence,” Holmes said.

 

The authors say their findings support the hypothesis that SARS-CoV-2 resulted from a recombination of viral sequences existing in horseshoe bats.

Still, even though the newly discovered viruses are closely related to SARS-CoV-2, all three viruses lack a sequence for what is known as the “furin cleavage site,” which is seen in SARS-CoV-2 and aids the virus’s entry into cells, according to Nature News. This means that, to better understand the origins of SARS-CoV-2, further research is needed to show how and when the furin site was introduced.

The findings are currently being considered for publication in a Nature journal, Bloomberg reported.


This article was originally published by Live Science. Read the original article here.

 
