Unique project combines imaging and AI to forge atomic-level model
In a new study, Chan Zuckerberg Biohub Investigators Oren Rosenberg, Adam Frost, and Tanja Kortemme, all of UC San Francisco, and a team of more than 70 scientists married cryo-electron microscopy (cryoEM) with a leading-edge deep learning system to unveil the atomic structure of an important but little-understood SARS-CoV-2 protein.
In addition to its role in SARS-CoV-2, the protein in question, Nsp2, is present in other pathogenic coronaviruses involved in recent disease outbreaks. Though there is considerable sequence variation, Nsp2-like proteins are also found in SARS-CoV-1, which caused the SARS outbreak in Asia in the early 2000s, and in the MERS virus. The authors say the new structure may therefore offer insights into viral mechanisms that are broadly applicable to the coronavirus family, and may reveal targets for new antiviral drugs.
“In terms of proteins that help you understand something fundamental about coronaviruses, Nsp2 is on the list,” says co–senior author Frost. “Certainly in the case of SARS-CoV-2, this protein is evolving, and different variants of Nsp2 sequences are now showing up in different regions of the world. As far as correlating those sequence differences to pathogenicity, morbidity, or mortality for infected patients, we’re not there yet, but it’s a protein that does appear to be under selective pressure, and understanding why is important.”
The work, now posted on the bioRxiv preprint server, is part of a massive ongoing scientific effort by the QBI Coronavirus Research Group (QCRG), a rapid-response research initiative assembled in the early days of the COVID-19 pandemic by co–senior author Nevan Krogan, director of UCSF’s Quantitative Biosciences Institute. The paper is one of several published over the past year featuring contributions from QCRG’s Structural Biology Consortium.
The most novel aspect of the new paper is the group’s use of AlphaFold2, a protein-structure prediction tool announced to great fanfare by London-based AI lab DeepMind in 2020. With cryoEM alone, some smaller domains of Nsp2 couldn’t be mapped at sufficiently high resolution by the UCSF group, but after feeding data on these lower-resolution regions into AlphaFold2, the team soon solved the complete protein at atomic resolution.
“To our knowledge, this is the first time this sort of exercise has been carried out,” says co–senior author Kliment Verba of UCSF, “and we really think this is the way a lot of structural biology is going to be done in the next decade: adding lower-resolution restraints from cryoEM to machine-learning models, which are already quite accurate with smaller domains, and putting them all together to make a complete model.”
Analysis of the final model showed that one Nsp2 region that is highly conserved, both across SARS-CoV-2 variants and across the other pathogenic coronaviruses, forms a so-called zinc ribbon motif. Though the precise role of this region in SARS-CoV-2 is still uncertain, this motif is present in many RNA binding proteins, and indirect evidence in the new study suggests that this region of Nsp2 binds nucleic acids in infected host cells.
The functions of two other highly conserved regions were less obvious, so the researchers introduced mutations to explore this question. In one case, mutations blocked Nsp2 interactions with the WASH protein complex, which governs cellular machinery related to actin and endosomes. In the other, mutations abolished interactions with a protein called GIGYF2, which inhibits the initiation of mRNA translation and is also involved in quality control of translation at ribosomes.
“This is speculative,” says Frost, who has studied GIGYF2 in other contexts, “but these findings are potentially quite exciting as a window into how viruses bias translation to serve their own goals. Every virus ‘wants’ the infected cell to deprioritize the synthesis of host proteins in favor of making viral proteins, and viruses accomplish that in different ways. Interactions between Nsp2 and GIGYF2 may be how coronaviruses attenuate translation of host messages in favor of viral messages, but there’s more work we need to do to figure out the precise mechanism.”
Verba says that in the spring of 2020, scientists in QCRG’s Structural Biology Consortium were struck by how speedily and efficiently CZ Biohub stood up its CLIAhub COVID testing lab, and they wanted to quickly put their own skills to use to contribute to the pandemic response.
“When we were setting this up, we were very much inspired by volunteer-based testing efforts happening at the Biohub. I know a ton of people who would just clock in and put in a lot of hours doing testing at the CLIAhub. Our work is very akin to that, but instead of volunteers putting in hours setting up and testing samples, they would come in and grow bacteria or purify protein, or set up crystal trays. Although initiated by the faculty members, the remarkable thing was that this effort was largely driven by self-organized trainees at UCSF, selflessly committing hundreds of hours to this work,” says Verba.
Research scientist Michelle Moritz; postdoctoral scholars Meghna Gupta, Caleigh Azumaya, Amy Diallo and Greg Merz; and graduate student Sergei Pourmal, all of UCSF, led the structural biology research described in the paper, “and everyone on that list contributed in one way or another,” Verba says.
“The coolest part? It worked,” adds co–senior author Rosenberg. “And it’s an example for the future that’s very much in the spirit of the Biohub. Why don’t we all do more work like this: doing great science together because it’s a good thing to do, not to get papers? If people saw how fun it is to work like this they’d do it more often.”