The proteins of the SARS-CoV‑2 virus play a key role in the virus's ability to trick human immune defenses and replicate in patient cells. An international research team with the participation of the Technical University of Munich (TUM) has now compiled the most comprehensive and detailed overview to date of all 3D structures of the virus proteins available worldwide. Evaluating these structures with artificial intelligence methods revealed surprising findings.
How does the SARS-CoV‑2 virus manage to evade immune defenses and replicate in the cells of patients? To answer this question, an international research team has assembled the most comprehensive overview to date of analyses of the exact three-dimensional shape of SARS-CoV‑2 proteins, including the well-known spike protein.
To compile this overview, the team used high-throughput machine learning. This approach makes it possible to predict structural states of coronavirus proteins based on analyses of related proteins. The database now consists of 2,060 3D models with atomic resolution. All structural models are freely available on the Aquaria-COVID website (https://aquaria.ws/covid).
“This provides an unprecedented level of detail that will help researchers better understand the molecular mechanisms of COVID-19 infection and develop therapies to combat the pandemic, for example by identifying potential new targets for future treatments or vaccines.”
- Burkhard Rost, Chair of Bioinformatics at TU Munich
The structural map unlocks the compiled knowledge
In the second part of the study, the team used a complementary approach known as human-in-the-loop machine learning to create a novel visual interface. It summarizes what is currently known about the three-dimensional shape of SARS-CoV‑2 proteins, and what is not.
Researchers can also use the visual interface as a navigation tool to find appropriate structural models for specific research questions. Work with the models has already provided some important clues about how coronaviruses manage to take command in our cells.
How coronaviruses manage to take command in our cells
Using machine learning algorithms, the team identified three coronavirus proteins (NSP3, NSP13, and NSP16) that “mimic” human proteins and successfully fool host cells into thinking they are endogenous proteins working in the best interest of the cell.
Modeling also revealed five coronavirus proteins (NSP1, NSP3, spike glycoprotein, envelope protein, and ORF9b protein) that “misappropriate” or disrupt processes in human cells. In this way, the virus manages to take control, complete its life cycle and spread.
Understanding how the virus works — and how to stop it
“In analyzing these structural models, we also found new clues about how the virus copies its own genome — which is the key process that enables the virus to spread rapidly in infected individuals,” says Burkhard Rost. “The findings from our study bring us closer to understanding how the virus works and what we can do to stop it.”
“The longer the virus circulates, the greater the risk that it will mutate and form new variants like the delta strain,” says Sean O’Donoghue, lead author of the study and a professor at the Garvan Institute in Sydney. “Our resource will help researchers understand how new strains of the virus differ from each other — a piece of the puzzle we hope will help combat emerging variants.”