Tag Archives: genome

Study explores incidence, severity, and long COVID associations of SARS-CoV-2 reinfections

In a recent study posted to the medRxiv* preprint server, a team of researchers from the United States used electronic health records to characterize the incidence, biomarkers, attributes, and severity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) reinfections and evaluated the association between reinfections and long coronavirus disease (COVID).

Study: SARS-CoV-2 Reinfection is Preceded by Unique Biomarkers and Related to Initial Infection Timing and Severity: an N3C RECOVER EHR-Based Cohort Study. Image Credit: Ralf Liebhold/Shutterstock

Background

The emergent SARS-CoV-2 variants are increasing the incidence of breakthrough infections. Mutations in the spike protein regions of these variants that increase immune escape, combined with the waning of the immunity induced by coronavirus disease 2019 (COVID-19) vaccines and previous SARS-CoV-2 infections, are resulting in a rise in reinfections. Studies based on whole genome sequences of the SARS-CoV-2 variants isolated from reinfected patients have revealed that the variants responsible for reinfections are distinct from those that caused the earlier infections. However, there is a dearth of information on whether reinfections differ from the initial infection in their incidence, severity, and attributes, as well as on long COVID complications after SARS-CoV-2 reinfections.

About the study

In the present study, the team used electronic health record data of a cohort exceeding 1.5 million individuals involved in the National COVID Cohort Collaborative (N3C), which is a part of the National Institutes of Health’s Researching COVID to Enhance Recovery (RECOVER) initiative. This data was used to evaluate the incidence, biomarkers, and attributes of SARS-CoV-2 reinfections and understand the association between post-acute sequelae of SARS-CoV-2 infection (PASC) and reinfections.

Reinfection was defined based on a positive SARS-CoV-2 antigen or polymerase chain reaction (PCR) test more than 60 days after the index date for the initial SARS-CoV-2 infection. Long COVID was defined based on the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes.
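
The reinfection definition above amounts to a simple date rule. A minimal Python sketch follows, assuming each patient's positive antigen or PCR tests are available as a list of dates. The function name, the data layout, and the choice to restart the 60-day clock at each new episode are illustrative assumptions, not the study's actual pipeline.

```python
from datetime import date, timedelta

# The 60-day threshold comes from the article; everything else is hypothetical.
REINFECTION_GAP = timedelta(days=60)


def find_reinfections(positive_test_dates):
    """Return the test dates that count as reinfections for one patient.

    A positive test is a reinfection if it falls more than 60 days after the
    index date of the preceding infection episode. Assumption: each detected
    reinfection becomes the index date for the next comparison.
    """
    dates = sorted(positive_test_dates)
    if not dates:
        return []
    index_date = dates[0]  # first positive test = initial infection
    reinfections = []
    for test_date in dates[1:]:
        if test_date - index_date > REINFECTION_GAP:
            reinfections.append(test_date)
            index_date = test_date  # assumed: the clock restarts here
    return reinfections


if __name__ == "__main__":
    tests = [date(2021, 1, 1), date(2021, 1, 20), date(2021, 3, 15), date(2021, 8, 1)]
    print(find_reinfections(tests))
```

In this sketch, a repeat positive test 19 days after the first one is treated as part of the same episode, while the tests on 15 March and 1 August each start a new one.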

Reinfections were also examined according to the epochs of SARS-CoV-2 variants, with the epoch of the wild-type strain spanning the March to November 2020 period, the Alpha, Beta, and Gamma variants dominating the December 2020–May 2021 period, and the Delta variant epoch spanning the June 2021–October 2021 period. The Omicron epoch was divided into two parts, one for the Omicron variant and one for the Omicron BA variants, corresponding to November 2021–March 2022 and March–August 2022, respectively.

Biomarkers such as inflammation, coagulopathies, and organ dysfunction can be used to characterize SARS-CoV-2 infections. A wide range of biomarkers, including laboratory measurements of white blood cell counts, erythrocyte sedimentation rates, C-reactive protein, serum creatinine, albumin, and many more, were used to characterize reinfections.

COVID-associated hospitalization data was used to determine the severity of reinfections. Infections that did not require a visit to the emergency department or hospitalization were categorized as mild, those requiring hospitalization as moderately severe, and those requiring hospitalization together with invasive mechanical ventilation, vasopressors, or extracorporeal membrane oxygenation as severe.
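
The severity tiers described above can be expressed as a small decision rule. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and placing emergency-department-only visits in the moderate tier is an assumption, since the description above does not say where they fall.

```python
def classify_severity(ed_visit, hospitalized,
                      ventilation=False, vasopressors=False, ecmo=False):
    """Map care-utilization flags to a severity tier (hypothetical helper).

    severe   : invasive mechanical ventilation, vasopressors, or ECMO
    moderate : hospitalization (assumed: also emergency-department-only visits)
    mild     : neither an emergency department visit nor hospitalization
    """
    if ventilation or vasopressors or ecmo:
        return "severe"
    if hospitalized or ed_visit:
        return "moderate"
    return "mild"


if __name__ == "__main__":
    print(classify_severity(ed_visit=False, hospitalized=False))
    print(classify_severity(ed_visit=False, hospitalized=True))
    print(classify_severity(ed_visit=False, hospitalized=True, ecmo=True))
```

Checking the most intensive interventions first keeps the tiers mutually exclusive, so each infection receives exactly one label.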

The period between reinfection and long COVID diagnoses was compared with that between the initial infection and diagnosis of long COVID to understand the relationship between reinfections and PASC.

Results

The results indicated that most individuals in the cohort had one reinfection, with a small group composed largely of non-Hispanic White males and older individuals having had three or more reinfections. The largest number of reinfections during the Omicron epoch were among individuals who had initial SARS-CoV-2 infections during the epochs of the wild-type, Alpha, Beta, and Gamma strains, followed by reinfections among those with initial Delta infections.

Analyses of biomarkers revealed that compared to the initial SARS-CoV-2 infection, reinfections showed lower elevation of hepatic inflammation markers such as alanine transaminase (ALT) and aspartate transaminase (AST). However, albumin levels were consistently low in reinfection patients.

Furthermore, the severity of reinfections was found to be associated with the severity of the initial SARS-CoV-2 infections. A majority of the cohort experienced mild symptoms during the initial infections and reinfections and did not require hospitalization or a visit to the emergency department. The percentage of individuals who required hospitalization or succumbed to the infection was marginally lower after reinfection than after the initial infection (12.6% vs. 14.4%). Close to half the patients who experienced a severe initial SARS-CoV-2 infection had moderate symptoms requiring hospitalization or emergency department visits during reinfection. Additionally, 7.4% of the individuals who had a severe initial infection had severe reinfections, and 5.7% succumbed to the reinfection.

Long COVID diagnoses also occurred in a shorter time frame for infections or reinfections during the Omicron epoch, as compared to infections during the Delta epoch or those with other variants.

Conclusions

Overall, the results indicated that the severity of SARS-CoV-2 reinfections was similar to that of the initial infection: individuals who experienced mild to moderate symptoms during the first infection had similar symptoms during reinfection, while individuals who experienced a severe initial infection either had similarly severe symptoms or succumbed to the disease after reinfection.

Additionally, the study reported that long COVID diagnoses during the Omicron epoch occurred much closer to the index date of the infection or reinfection, and the number of long COVID diagnoses also showed an increase after reinfections with recent variants.

*Important notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

Journal reference:

  • Emily Hadley, Yun Jae Yoo, Saaya Patel, Andrea Zhou, Bryan Laraway, Rachel Wong, Alexander Preiss, Rob Chew, Hannah Davis, Christopher G Chute, Emily R Pfaff, Johanna Loomba, Melissa Haendel, Elaine Hill, Richard Moffitt, and the N3C and RECOVER consortia. (2023). SARS-CoV-2 Reinfection is Preceded by Unique Biomarkers and Related to Initial Infection Timing and Severity: an N3C RECOVER EHR-Based Cohort Study. medRxiv. doi: https://doi.org/10.1101/2023.01.03.22284042 https://www.medrxiv.org/content/10.1101/2023.01.03.22284042v1

Read original article here

Genome sequencing trial to test benefits of identifying genetic diseases at birth

Genomics England is to test whether sequencing babies’ genomes at birth could help speed up the diagnosis of about 200 rare genetic diseases, and ensure faster access to treatment.

The study, which will sequence the genomes of 100,000 babies over the next two years, will explore the cost-effectiveness of the approach, as well as how willing new parents are to accept it.

Although researchers will only search babies’ genomes for genetic conditions that surface during early childhood, and for which an effective treatment already exists, their sequences will be held on file. This could open the door to further tests that could identify untreatable adult onset conditions, or other genetically determined traits, in the future.

“One challenging thing with newborn genomes is that they will potentially accompany people from cradle to grave,” said Sarah Norcross, director of the Progress Educational Trust (PET), an independent charity that improves choices for people affected by infertility and genetic conditions.

Ensuring the privacy of this data is therefore essential. “People must be able to trust that any data collected will only be used in the agreed way, and for the stated purpose,” Norcross said.

Each year, approximately 3,000 children are born in the UK with a treatable rare condition that could be detected using genome sequencing. Although newborn babies are currently offered a heel-prick test to screen their blood for signs of nine rare but serious conditions, such as sickle cell disease and cystic fibrosis, whole genome sequencing could enable hundreds more such conditions to be diagnosed at birth.

Currently, such diseases are usually only diagnosed once a child develops symptoms, often after months or years of tests. One such condition is biotinidase deficiency, an inherited disorder in which the body is unable to recycle the vitamin biotin. Affected children may experience seizures and delays in reaching developmental milestones, and have problems with vision or hearing, but early diagnosis and treatment with biotin supplements can prevent this deterioration and keep them healthy.

Dr Richard Scott, chief medical officer at Genomics England, said: “At the moment, the average time to diagnosis in a rare disease is about five years. This can be an extraordinary ordeal for families, and it also puts pressure on the health system. The question this programme is responding to is: ‘is there a way that we can get ahead of this?’”

The study aims to recruit 100,000 newborn children to undergo voluntary whole genome sequencing over the next two years, to assess the feasibility and effectiveness of the technology – including whether it could save the NHS money by preventing serious illness.

It will also explore how researchers might access an anonymised version of this database to study people as they grow older, and whether a person’s genome might be used throughout their lives to inform future healthcare decisions. For instance, if someone develops cancer when they are older, there may be an opportunity to use their stored genetic information to help diagnose and treat them.

According to research commissioned by PET earlier this year, 57% of the UK public would support the storage of genetic data in a national database, provided it were only accessible to the sequenced individual and healthcare professionals involved in their care. Only 12% of people opposed this.

Of greater concern would be the storage of a person’s genetic data for use by government authorities including the police, with the person being identifiable to those authorities. This was supported by 40% of people, and opposed by 25%. Norcross said that while Genomics England has good safeguards in place for providing research access to genomic data, “this risk can never be eliminated completely”.

Scott stressed that the purpose of the trial was to explore whether the potential benefits of newborn sequencing stack up, and engage in a genuine national debate about whether the technology is something people feel comfortable with. “The bottom line here is about us taking a cautious approach, and developing a view jointly nationally about what the right approach is, and what the right safeguards are,” he said.

Others raised concerns about the potential for false or uncertain results. Frances Flinter, emeritus professor of clinical genetics at Guy’s & St Thomas NHS foundation trust, and a member of the Nuffield council on bioethics, said: “Using whole genome sequencing to screen newborn babies is a step into the unknown. Getting the balance of benefit and harm right will be crucial. The potential benefits are early diagnosis and treatment for more babies with genetic conditions. The potential harms are false or uncertain results, unnecessary anxiety for parents, and a lack of good follow-up care for babies with a positive screening result.

“We must not race to use this technology before both the science and ethics are ready. This research programme could provide new and important evidence on both. We just hope the question of whether we should be doing this at all is still open.”

World’s Largest Autism Whole Genome Sequencing Study Reveals 134 Autism-Linked Genes

Summary: Researchers have identified 134 genes associated with autism and a range of genetic alterations associated with ASD. Notably, the study identified changes in copy number variations with likely associations with ASD, including autism-associated variants in 14% of people on the autism spectrum.

Source: Hospital for Sick Children

Researchers from The Hospital for Sick Children (SickKids) have uncovered new genes and genetic changes associated with autism spectrum disorder (ASD) in the largest autism whole genome sequencing analysis to date, providing a better understanding of the ‘genomic architecture’ that underlies this disorder.

The study, published today in Cell, used whole genome sequencing (WGS) to examine the entire genomes of over 7,000 individuals with autism as well as an additional 13,000 siblings and family members.

The team found 134 genes linked with ASD and discovered a range of genetic changes, most notably gene copy number variations (CNVs), likely to be associated with autism, including ASD-associated rare variants in about 14 per cent of participants with autism.

The majority of data was drawn from the Autism Speaks MSSNG database, the world’s largest autism whole genome dataset, which provides autism researchers with free, open access to thousands of sequenced genomes.

“By sequencing the entire genome of all participants, and with deep involvement from the participating families in MSSNG on forming our research priorities, we maximize the potential for discovery and allow analysis that encompasses all types of variants, from the smallest DNA changes to those that affect entire chromosomes,” says Dr. Stephen Scherer, Senior Scientist, Genetics & Genome Biology and Chief of Research at SickKids and Director of the McLaughlin Centre at the University of Toronto.

Dr. Brett Trost, lead author of the paper and a Research Associate in the Genetics & Genome Biology program at SickKids, notes the use of WGS allowed researchers to uncover variant types that would not have otherwise been detectable.

These variant types include complex rearrangements of DNA, as well as tandem repeat expansions, a finding supported by recent SickKids research on the link between autism and DNA segments that are repeated many times.

The role of the maternally inherited mitochondrial DNA was also examined in the study, and mitochondrial variants were found to account for two percent of the ASD-associated rare variants identified.

The paper also points to important nuances in autism genetics in families with only one individual with autism compared with families that have multiple individuals with autism, known as multiplex families.

Surprising to the team was that the “polygenic score” – an estimation of the likelihood of an individual having autism, calculated by aggregating the effects of thousands of common variants throughout the genome – was not higher among multiplex families.

“This suggests that autism in multiplex families may be more likely to be linked to rare, highly impactful variants inherited from a parent. Because both the genetics and clinical traits associated with autism are so complex and varied, large data sets like the ones we used are critical to providing researchers with a clearer understanding of the genetic architecture of autism,” says Trost.

The research team says the study data can help expand inquiries into the range of variants that might be linked to ASD, as well as efforts to better understand contributors to the 85 percent of autistic individuals for which the genetic cause remains unresolved.

In a linked study of 325 families with ASD from Newfoundland published this same month in Nature Communications, Dr. Scherer’s team found that combinations of spontaneous, rare-inherited, and polygenic genetic factors coming together in the same individual can potentially lead to different sub-types of autism.

Dr. Suzanne Lewis, a geneticist and investigator at the BC Children’s Hospital Research Institute who diagnosed many of the families enrolled in the study said, “Collectively, these latest findings represent a massive step forward in better understanding the complex genetic and biological circuitry linked with ASD.

“This rich data set also offers an opportunity to dive deeper into examining other factors that may determine an individual’s chance of developing this complex condition to help individualize future treatment approaches.”

Funding: Funding for this study was provided by the University of Toronto McLaughlin Centre, Genome Canada/Ontario Genomics, Genome BC, Government of Ontario, Canadian Institutes of Health Research, Canada Foundation for Innovation, Autism Speaks, Autism Speaks Canada, Brain Child, Kids Brain Health Network, Qatar National Research Fund, Ontario Brain Institute, SFARI and SickKids Foundation.

About this genetics and autism research news

Author: Jelena Djurkic
Source: Hospital for Sick Children
Contact: Jelena Djurkic – Hospital for Sick Children
Image: The image is in the public domain

Original Research: Closed access.
“Genomic architecture of autism from comprehensive whole-genome sequence annotation” by Stephen Scherer, et al. Cell


Abstract

Genomic architecture of autism from comprehensive whole-genome sequence annotation

Highlights

  • New MSSNG release contains WGS from 11,312 individuals from families with ASD
  • Extensive variant data available, including SNVs/indels, SVs, tandem repeats, and PRS
  • Annotation reveals 134 ASD-associated genes, plus SVs not detectable without WGS
  • Rare, dominant variation has a prominent role in multiplex ASD

Summary

Fully understanding autism spectrum disorder (ASD) genetics requires whole-genome sequencing (WGS). We present the latest release of the Autism Speaks MSSNG resource, which includes WGS data from 5,100 individuals with ASD and 6,212 non-ASD parents and siblings (total n = 11,312).

Examining a wide variety of genetic variants in MSSNG and the Simons Simplex Collection (SSC; n = 9,205), we identified ASD-associated rare variants in 718/5,100 individuals with ASD from MSSNG (14.1%) and 350/2,419 from SSC (14.5%).

Considering genomic architecture, 52% were nuclear sequence-level variants, 46% were nuclear structural variants (including copy-number variants, inversions, large insertions, uniparental isodisomies, and tandem repeat expansions), and 2% were mitochondrial variants.
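
The proportions quoted in the summary can be checked with a few lines of arithmetic. The Python sketch below uses only the counts reported above; the helper and variable names are illustrative.

```python
def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Individuals with ASD-associated rare variants, as reported in the summary.
mssng_share = percent(718, 5100)  # MSSNG cohort
ssc_share = percent(350, 2419)    # Simons Simplex Collection

# Reported breakdown of those variants by type, in percent.
breakdown = {
    "nuclear sequence-level": 52,
    "nuclear structural": 46,
    "mitochondrial": 2,
}

print(mssng_share, ssc_share, sum(breakdown.values()))  # prints: 14.1 14.5 100
```

Both cohorts land close to 14%, and the three variant categories add up to 100%, which is consistent with the roughly 85% of cases described as still unexplained.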

Our study provides a guidebook for exploring genotype-phenotype correlations in families who carry ASD-associated rare variants and serves as an entry point to the expanded studies required to dissect the etiology in the ∼85% of the ASD population that remain idiopathic.

Nearly 1,000 Microbe Species Have Just Been Discovered in ‘Extreme’ Tibetan Glaciers

Living as a microbe on the Tibetan Plateau isn’t easy: there are frigid temperatures, high levels of solar radiation, and not a lot to eat, and you’d regularly get frozen and then thawed depending on the time of year.

So, it’s a bit of a surprise that in these ‘extreme environmental conditions’ scientists have discovered 968 species spanning a hugely diverse range of microbes. The finding comes courtesy of the first dedicated genome catalog of the glacier ecosystem.

“The surfaces of glaciers support a diverse array of life, including bacteria, algae, archaea, fungi, and other microeukaryotes. Microorganisms have demonstrated the ability to adapt to these extreme conditions and contribute to vital ecological processes,” writes the team in their new paper.

“Glacier ice can also act as a record of microorganisms from the past, with ancient (more than 10,000 years old) airborne microorganisms being successfully revived. Therefore, the glacial microbiome also constitutes an invaluable chronology of microbial life on our planet.”

The researchers homed in on the glaciers of one specific region – the Tibetan Plateau. This 2.5 million square kilometer region is an important water source for the surrounding areas in Asia and has been particularly affected by climate change, with over 80 percent of glaciers having started to retreat.

Not only is it important for us to know which microbes are up there (just in case they could be a problem for humans and the ecosystem as the ice melts), but if we don’t note what species are currently there, climate change might soon see them lost to history.

“Here we present the first, to our knowledge, dedicated genome and gene catalog for glacier ecosystems, comprising 3,241 genomes and metagenome-assembled genomes and 25 million non-redundant proteins from 85 Tibetan glacier metagenomes and 883 cultivated isolates,” the team, led by Lanzhou University ecologist Yongqin Liu, writes in their paper.

The researchers undertook a mammoth effort, sampling snow, ice, and dust from 21 Tibetan glaciers between 2016 and 2020. They used metagenomic methods on the samples to collect all of the genetic material present; they also cultured some of the microbes in a lab to find out more about them and to retrieve a higher proportion of their genome.

Excitingly, 82 percent of the genomes were novel species. A whopping 11 percent of species were found only in one glacier, while 10 percent were located in almost all the glaciers studied.

The project has become what the researchers are calling the ‘Tibetan Glacier Genome and Gene’ (TG2G) catalog, and hopefully this will be of use for researchers in the future, with new additions as more species are found.

“The TG2G catalog offers a database and a platform for archiving, analysis and comparison of glacier microbiomes at the genome and gene levels. It is particularly timely as the glacier ecosystem is threatened by global warming, and glaciers are retreating at an unprecedented rate,” the team writes.

“We envisage that the catalog will form the basis of a comprehensive global repository for glacial microbiome data.”

The research has been published in Nature Biotechnology.

Genome Analysis Now Allows Scientists To Predict if You Will Have a Miscarriage

The researchers discovered three genes, MCM5, FGGY, and DDX60L, that, when mutated, are strongly linked to the risk of developing eggs with an abnormal number of chromosomes.

In order to shed light on the genetic cause of female infertility, Rutgers researchers have combined genomic sequencing with machine learning techniques

According to Rutgers University research, specialized analysis of a woman’s genome may be used to predict her likelihood of experiencing one of the most common forms of miscarriages.

This knowledge, according to scientists, could help patients and doctors make more educated judgments about their reproductive options and fertility treatment strategies.

Rutgers researchers describe a technique that combines genomic sequencing with machine-learning methods to predict the likelihood of a woman miscarrying due to egg aneuploidy – a term that describes a human egg with an abnormal number of chromosomes – in a recent study published in the journal Human Genetics.

Infertility is a serious reproductive health condition that affects around 12% of women of reproductive age in the United States. Aneuploidy in human eggs causes early miscarriage and in vitro fertilization (IVF) failure and accounts for a major percentage of infertility.

Recent research has demonstrated that some genes predispose specific women to aneuploidy, although the precise genetic origins of aneuploid egg production remain unknown. The Rutgers research is the first to assess how strongly particular genetic variants in the mother’s genome predict a woman’s infertility risk.

“The goal of our project was to understand the genetic cause of female infertility and develop a method to improve the clinical prognosis of patients’ aneuploidy risk,” said Jinchuan Xing, an author of the study and an associate professor in the genetics department at the Rutgers School of Arts and Sciences. “Based on our work, we showed that the risk of embryonic aneuploidy in female IVF patients can be predicted with high

While age is a predictive factor for aneuploidy, it is not a highly accurate gauge because aneuploidy rates within individuals of the same age can vary dramatically. Identifying genetic variations with more predictive power arms women and their treating clinicians with better information, Xing said.

“I like to think of the coming era of genetic medicine when a woman can enter a doctor’s office or, in this case, perhaps, a fertility clinic with her genomic information, and have a better sense of how to approach treatment,” Xing said. “Our work will enable such a future.”

The study was funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development, the National Institute of General Medical Sciences, and the National Institute of Mental Health. 

Reference: “Predicting embryonic aneuploidy rate in IVF patients using whole-exome sequencing” by Siqi Sun, Maximilian Miller, Yanran Wang, Katarzyna M. Tyc, Xiaolong Cao, Richard T. Scott Jr., Xin Tao, Yana Bromberg, Karen Schindler and Jinchuan Xing, 26 March 2022, Human Genetics.
DOI: 10.1007/s00439-022-02450-z

Gallstone from a mummified 16th century prince used to reconstruct the ancient genome of E. coli

When you think about precious crown jewels, a 400-year-old gallstone is probably not what springs to mind!

However, a team of scientists have found something very valuable inside calcified balls extracted from a 16th century Italian prince’s gallbladder.

Remnants of early E. coli were found to be present, and researchers from McMaster University in Canada have used them to reconstruct the first ancient genome of the bacterium.

This can act as a ‘point of comparison’ to tell us information about how the notorious superbug has evolved over the past 400 years.

The findings, published today in the journal Communications Biology, could allow researchers to eventually pinpoint when E. coli acquired antibiotic resistance.

The mummified remains of Giovani d’Avalos were recovered from the Abbey of Saint Domenico Maggiore in Naples in 1983, along with those of other Italian nobles from the Renaissance period.

The Neapolitan nobleman, who died in 1586 aged 48, is thought to have suffered from chronic inflammation of the gallbladder due to gallstones. 

Study lead author George Long said: ‘When we were examining these remains, there was no evidence to say this man had E. coli.

‘Unlike an infection like smallpox, there are no physiological indicators. No one knew what it was.’

E. coli, or Escherichia coli, can infect the organs that contribute to the production and transportation of bile, including the gallbladder.

It is able to release an enzyme that can turn bilirubin, a chemical produced during the normal breakdown of haemoglobin, into calcium salts – the first step in pigment stone formation.

As well as contributing to gallstone formation, E. coli can cause food poisoning, diarrhoea, urinary tract infections and pneumonia.

It is known as a ‘commensal’ – a bacterium that resides within us and can act as an opportunistic pathogen infecting its host during periods of stress, underlying disease or immunodeficiency.

E. coli is also known to be resistant to antibiotics, giving it its title as a ‘superbug’.

WHAT IS E. COLI AND WHY IS IT DANGEROUS?

E. coli (Escherichia coli) are bacteria that generally live in the intestines of healthy people and animals.

Infections can occur after coming into contact with the faeces of humans or animals, or by eating contaminated food or drinking contaminated water.

Symptoms of an E. coli infection include bloody diarrhea, stomach cramps, nausea and vomiting.

In rare cases, sufferers can develop a type of kidney failure called hemolytic uremic syndrome (HUS).

This is a condition in which there is an abnormal destruction of blood platelets and red blood cells.

According to the Mayo Clinic, the damaged blood cells can clog the kidney’s filtering system, resulting in life-threatening kidney failure.

No specific treatment currently exists for these infections. They usually clear up within one week, but medical professionals recommend resting and drinking fluids to help prevent dehydration and fatigue.

Researchers had to meticulously isolate fragments of the target bacterium, which had been degraded by environmental contamination from several sources. 

They used the recovered material to reconstruct the first ancient E. coli genome.

However, the research team explained that its full evolutionary history remains a mystery, including when it acquired antibiotic resistance.

Research leader Professor Hendrik Poinar said: ‘A strict focus on pandemic-causing pathogens as the sole narrative of mass mortality in our past misses the large burden that stems from opportunistic commensals driven by the stress of lives lived.’

Evolutionary geneticist Prof Poinar, of Canada’s McMaster University which led the research, said: ‘Modern E. coli is commonly found in the intestines of healthy people and animals.

‘While most forms are harmless, some strains are responsible for serious, sometimes fatal food poisoning outbreaks and bloodstream infections. The hardy and adaptable bacterium is recognised as especially resistant to treatment.’

He explained that having the genome of a 400-year-old ancestor to the modern bacterium provides researchers a ‘point of comparison’ for studying how it has evolved and adapted since that time.

He explained that the technological feat is particularly remarkable because E. coli is both ‘complex and ubiquitous’ – living not only in the soil but also in our own microbiomes.

Professor Erick Denamur, from the Paris Diderot University, said: ‘It was so stirring to be able to type this ancient E. coli and find that while unique it fell within a phylogenetic lineage characteristic of human commensals that is today still causing gallstones.’

Long added: ‘We were able to identify what was an opportunistic pathogen, dig down to the functions of the genome, and provide guidelines to aid researchers who may be exploring other, hidden pathogens.’

WHAT IS A GENOME?

An organism’s genome is written in a chemical code called DNA.

DNA, or deoxyribonucleic acid, is a complex chemical in almost all organisms that carries genetic information.

It is located in chromosomes in the cell nucleus, and almost every cell in a person’s body has the same DNA.

The human genome is composed of more than three billion pairs of these building-block molecules and grouped into some 25,000 genes.

It contains the codes and instructions that tell the body how to grow and develop, but flaws in the instructions can lead to disease.

Currently, less than 0.2 per cent of the Earth’s species have been sequenced.

The first decoding of a human genome – completed in 2003 as part of the Human Genome Project – took 15 years and cost £2.15 billion ($3bn).

A group of 24 international scientists want to collect and store the genetic codes of all 1.5 million known plants, animals and fungi over the next decade.

The resulting library of life could be used by scientists to find out more about the evolution of species and how to improve our environment.

The £3.4 billion ($4.7bn) project is being described as the ‘most ambitious project in the history of modern biology’.


Read original article here

Scientists Have Finally Sequenced the Complete Human Genome – And Revealed New Genetic Secrets

Sequencing the last 8% of the human genome has taken 20 years and the invention of new techniques for reading long sequences of the genetic code, which consists of the nucleotides C, T, G and A. The entire genome consists of more than 3 billion nucleotides. Credit: Ernesto del Aguila III, NHGRI

Repetitive

The spindles (green) that pull chromosomes apart during cell division are attached to a protein complex called the kinetochore, which latches onto the chromosome at a place called the centromere — a region containing highly repetitive DNA sequences. Comparing the sequences of these repeats revealed where mutations have accumulated over millions of years, reflecting the relative age of each repeat. Repeats in the active centromere tend to be the youngest and most recently duplicated sequences in the region, and they have strikingly low DNA methylation. Surrounding the active centromere on both sides are older repeats, probably the relics of former centromeres, with the oldest ones farthest from the active centromere. The researchers hope that new experimental methods will help reveal why centromeres evolve from the middle, as well as why this pattern is so closely associated with binding by the kinetochore and with low DNA methylation. Credit: Nicolas Altemose, UC Berkeley

“Without proteins, DNA is nothing,” said Altemose, who earned a Ph.D. in bioengineering jointly from UC Berkeley and UC San Francisco in 2021 after having received a D.Phil. in statistics from Oxford University. “DNA is a set of instructions with no one to read it if it doesn’t have proteins around to organize it, regulate it, repair it when it’s damaged and replicate it. Protein-DNA interactions are really where all the action is happening for genome regulation, and being able to map where certain proteins bind to the genome is really important for understanding their function.”

After the T2T consortium sequenced the missing DNA, Altemose and his team used new techniques to find the place within the centromere where a big protein complex called the kinetochore solidly grips the chromosome so that other machines inside the nucleus can pull chromosome pairs apart.

“When this goes wrong, you end up with missegregated chromosomes, and that leads to all kinds of problems,” he said. “If that happens in meiosis, that means you can have chromosomal anomalies leading to spontaneous miscarriage or congenital diseases. If it happens in somatic cells, you can end up with cancer — basically, cells that have massive misregulation.”

What they found in and around the centromeres were layers of new sequences overlaying layers of older sequences, as if through evolution new centromere regions have been laid down repeatedly to bind to the kinetochore. The older regions are characterized by more random mutations and deletions, indicating they’re no longer used by the cell. The newer sequences where the kinetochore binds are much less variable, and also less methylated. The addition of a methyl group is an epigenetic tag that tends to silence genes.
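The idea that older repeat layers carry more accumulated mutations can be illustrated with a small sketch. It is not the researchers' method: the sequences and the names `hamming_fraction`, `rank_by_divergence`, `active`, `flank_near` and `flank_far` are made up for illustration, and the comparison only counts substitutions between equal-length strings, whereas real analyses rely on sequence alignment that also handles insertions and deletions.

```python
def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def rank_by_divergence(reference: str, copies: dict[str, str]) -> list[tuple[str, float]]:
    """Sort repeat copies from least to most diverged from the reference."""
    scored = [(name, hamming_fraction(reference, seq)) for name, seq in copies.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy data (hypothetical): one reference repeat and three copies of it.
reference = "ACGTACGTAC"
copies = {
    "flank_far": "ACCTACGAAA",   # three substitutions
    "active": "ACGTACGTAC",      # identical to the reference
    "flank_near": "ACGTACGTAA",  # one substitution
}

ranked = rank_by_divergence(reference, copies)
for name, divergence in ranked:
    print(f"{name}: {divergence:.1f}")
```

In this toy ranking, the copy with the most differences from the reference sorts last, which is the sense in which divergence can serve as a rough proxy for relative age.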

All of the layers in and around the centromere are composed of repetitive lengths of DNA, based on a unit about 171 base pairs long, which is roughly the length of DNA that wraps around a group of proteins to form a nucleosome, keeping the DNA packaged and compact. These 171 base pair units form even larger repeat structures that are duplicated many times in tandem, building up a large region of repetitive sequences around the centromere.
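A tandem repeat of this kind can be recognised because the sequence matches itself when shifted by the length of the repeating unit. The sketch below is only an illustration under simplifying assumptions: it uses a made-up 7-base unit rather than the roughly 171-base-pair unit described above, the copies are perfect, and the function names `match_fraction` and `best_period` are hypothetical.

```python
def match_fraction(seq: str, shift: int) -> float:
    """Fraction of positions where seq[i] equals seq[i + shift]."""
    pairs = len(seq) - shift
    if shift < 1 or pairs <= 0:
        raise ValueError("shift must be at least 1 and shorter than the sequence")
    return sum(seq[i] == seq[i + shift] for i in range(pairs)) / pairs


def best_period(seq: str, max_period: int) -> int:
    """Smallest shift in 1..max_period with the highest self-match fraction."""
    best_shift, best_score = 1, -1.0
    for shift in range(1, max_period + 1):
        score = match_fraction(seq, shift)
        if score > best_score:  # strict comparison keeps the smallest shift on ties
            best_shift, best_score = shift, score
    return best_shift


# Toy data (hypothetical): six perfect tandem copies of a 7-base unit.
unit = "ACGGTCA"
array = unit * 6

print(best_period(array, 20))
```

Real centromeric repeats are imperfect copies, so the self-match fraction at the true period would be high rather than exactly 1, and dedicated repeat-finding tools would be used in practice.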

The T2T team focused on only one human genome, obtained from a non-cancerous tumor called a hydatidiform mole, which is essentially a human embryo that rejected the maternal DNA and duplicated its paternal DNA instead. Such embryos die and transform into tumors. But the fact that this mole had two identical copies of the paternal DNA — both with the father’s X chromosome, instead of different DNA from both mother and father — made it easier to sequence.

The researchers also released this week the complete sequence of a Y chromosome from a different source, which took nearly as long to assemble as the rest of the genome combined, Altemose said. The analysis of this new Y chromosome sequence will appear in a future publication.

When the researchers compared centromeric regions of 1,600 people from around the world, they found that those without recent African ancestry mostly had two types of sequence variations. The proportions of these two variations are represented by the black and light gray wedges within the circles, which are placed on the map near the location where each group of individuals was sampled. Those from Africa or other areas with a large proportion of people with recent African ancestry, like the Caribbean, had much more centromeric sequence variation, represented by the multi-colored wedges. Such variations could help track how centromeric regions evolve, as well as how these genetic variants are related to health and disease. Credit: Nicolas Altemose, UC Berkeley

Altemose and his team, which included UC Berkeley project scientist Sasha Langley, also used the new reference genome as a scaffold to compare the centromeric DNA of 1,600 individuals from around the world, revealing major differences in both the sequence and copy number of repetitive DNA around the centromere. Previous studies have shown that when groups of ancient humans migrated out of Africa to the rest of the world, they took only a small sample of genetic variants with them. Altemose and his team confirmed that this pattern extends into centromeres.

“What we found is that in individuals with recent ancestry outside the African continent, their centromeres, at least on chromosome X, tend to fall into two big clusters, while most of the interesting variation is in individuals who have recent African ancestry,” Altemose said. “This isn’t entirely a surprise, given what we know about the rest of the genome. But what it suggests is that if we want to look at the interesting variation in these centromeric regions, we really need to have a focused effort to sequence more African genomes and do complete telomere-to-telomere sequence assembly.”

DNA sequences around the centromere could also be used to trace human lineages back to our common ape ancestors, he noted.

“As you move away from the site of the active centromere, you get more and more degraded sequence, to the point where if you go out to the furthest shores of this sea of repetitive sequences, you start to see the ancient centromere that, perhaps, our distant primate ancestors used to bind to the kinetochore,” Altemose said. “It’s almost like layers of fossils.”

Long-read sequencing a game changer

The T2T’s success is due to improved techniques for sequencing long stretches of DNA at once, which helps when determining the order of highly repetitive stretches of DNA. Among these are PacBio’s HiFi sequencing, which can read lengths of more than 20,000 base pairs with high

One reason it took 20 years to complete the human genome sequence: much of our DNA is extremely repetitive. Credit: Infographic courtesy of NHGRI, NIH

“These new long-read DNA sequencing technologies are just incredible; they’re such game changers, not only for this repetitive DNA world, but because they allow you to sequence single long molecules of DNA,” Altemose said. “You can begin to ask questions at a level of resolution that just wasn’t possible before, not even with short-read sequencing methods.”

Altemose plans to explore the centromeric regions further, using an improved technique he and colleagues at Stanford developed to pinpoint the sites on the chromosome that are bound by proteins, similar to how the kinetochore binds to the centromere. This technique, too, uses long-read sequencing technology. He and his group described the technique, called Directed Methylation with Long-read sequencing (DiMeLo-seq), in a paper that appeared this week in the journal Nature Methods.

Meanwhile, the T2T consortium is partnering with the Human PanGenome Reference Consortium to work toward a reference genome that represents all of humanity.

“Instead of just having one reference from one human individual or one hydatidiform mole, which isn’t even a real human individual, we should have a reference that represents everybody,” Altemose said. “There are various ideas about how to accomplish that. But what we need first is a grasp of what that variation looks like, and we need lots of high-quality individual genome sequences to accomplish that.”

His work on the centromeric regions, which he called “a passion project,” was funded by postdoctoral fellowships. The leaders of the T2T project were Karen Miga of UC Santa Cruz, Evan Eichler of the
