At the intersection of machine learning and life science
In September of 2008, a young woman living in a suburb of Lusaka, Zambia’s capital, was stricken by a mysterious illness driven by an unknown virus. As headache, fever, and body aches gave way to more frightening symptoms including seizures and facial swelling, the woman was admitted to a local hospital before being transferred to a facility in Johannesburg, South Africa, some 740 miles away.
Her symptoms did not improve in Johannesburg, and healthcare workers from the two hospitals soon began to suffer the same symptoms, which progressed further to brain swelling and organ failure. By mid-October, the virus — dubbed “Lujo” for the two cities it now affected — had killed four people and left a fifth with severe medical complications.
And then, it simply disappeared.
“We don’t know where it came from, we don’t know where it is, and we don’t know when it might come back,” says Samuel Munjita of the University of Zambia, who’s searching for the origins of Lujo and related viruses in rodent populations in and around Lusaka. “I want to figure out how a virus like this could pass from animals to humans. If you know how that happens it makes it much easier to prepare for outbreaks.”
Though alarming, the Lujo virus story is not unusual. Novel pathogens often show up seemingly out of nowhere. So how can public health officials rapidly discern the origins of a novel infectious disease, and perhaps prevent future outbreaks?
Having heard about Munjita’s work, members of the Chan Zuckerberg Biohub’s Rapid Response Group, which develops strategies to tackle emerging pathogens, invited him and two Zambian colleagues working on other infectious diseases to visit CZ Biohub last April for a two-week workshop on metagenomic next-generation sequencing (mNGS). This powerful technique can rapidly and simultaneously detect any potential pathogens (viruses, bacteria, fungi, or parasites) in a biological sample.
The team also learned to explore their genomic data using CZ ID, a free, cloud-based technology jointly created by the Biohub and the Chan Zuckerberg Initiative (CZI) that helps researchers detect and track infectious diseases. CZ ID analyzes uploaded mNGS data to precisely identify the pathogens in a particular sample based on their distinctive genomic signatures.
Although some viruses, like Lujo, vanish after a short period of wreaking havoc, others, such as HIV, Ebola, and SARS-CoV-2, change the world forever. Yet conventional tests often fail to pinpoint the causes of disease. For example, some studies have estimated that in a majority of cases of patients hospitalized with encephalitis, a rare and dangerous swelling of the brain, physicians never identify the underlying cause of the illness.
This stems from how conventional tests are set up — physicians must use a patient’s symptoms to predict their illness, then test for a “yes” or “no.” If a physician’s prediction is incorrect, they are left without answers.
On the other hand, mNGS reads every piece of DNA or RNA present in a sample, meaning researchers are alerted to every potential pathogen present — even pathogens that haven’t been encountered before. And with technologies like CZ ID, instead of playing “20 Questions” when diagnosing novel diseases, researchers and physicians are able to make one straightforward, unbiased inquiry: What do we have here?
Like all Biohub and CZI platforms, CZ ID, which grew out of pioneering work by Biohub President and “disease detective” Joe DeRisi, is open source and open access. With a user-friendly interface, CZ ID allows anyone to upload sequencing data to cloud-based servers to be compared against petabytes of public genomic data for identification. Users get their results quickly, and everyone benefits from sharing DNA reads — the more there are, the more likely a match will be found to tell doctors and researchers what’s in their samples.
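The idea of comparing sequencing reads against reference genomes can be illustrated with a toy sketch. This is not CZ ID's actual pipeline, which the article does not describe; it is a minimal example of one common matching idea (counting shared short subsequences, or k-mers), and the function names, the reference names `virus_A` and `bacterium_B`, and all sequences are invented for illustration.

```python
from collections import Counter

# Hypothetical, made-up reference sequences (real genomes are far longer).
REFERENCES = {
    "virus_A": "ATGCGTACGTTAGCCTAGGATCCGATAGCTAGCTTAGGCA",
    "bacterium_B": "TTGACCGGTAAACCCGGGTTTAAACGCGCGATATATCGCG",
}

# Hypothetical reads from one sample; the last matches neither reference.
READS = ["CGTACGTTAGCCTAGG", "CCGGGTTTAAACGCGC", "GGGGGGGGGGGGGGGG"]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, genome in references.items():
        for kmer in kmers(genome, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Return the reference sharing the most k-mers with the read, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None  # no known reference shares a k-mer with this read
    return votes.most_common(1)[0][0]


def tally_sample(reads, references, k=8):
    """Count how many reads are assigned to each reference (None = unmatched)."""
    index = build_index(references, k)
    return Counter(classify_read(read, index, k) for read in reads)


if __name__ == "__main__":
    print(tally_sample(READS, REFERENCES))
```

In this sketch the first read is assigned to `virus_A`, the second to `bacterium_B`, and the third is left unmatched. Unmatched reads are the interesting case for novel pathogens: a larger shared reference collection raises the chance that some read finds a match.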
The implications of introducing this technology globally are profound. Through joint funding from CZI and the Bill & Melinda Gates Foundation’s Grand Challenges grant program, several other teams, including groups from Cambodia, Nepal, and Madagascar, have also come to San Francisco to work with Biohub scientists on implementing mNGS and CZ ID in their home countries.
In addition to working with grantees from the Gates Foundation and CZI, the Biohub “also collaborates directly with organizations such as the Africa CDC, the NIH, and other academic institutions to provide training to researchers working in academic and public health laboratories within resource-constrained regions of the world,” says Cristina Tato, director of Rapid Response at the Biohub. “We strive to help build a global network for data sharing around incidence of infectious disease.”
For example, before the COVID-19 pandemic, members of the Biohub’s Rapid Response team visited four countries for onsite mentorship and scholarly exchange. Through these mentorship efforts, CZ ID has been put in place in 14 countries, with additional locations on the near horizon. All told, there are CZ ID users in 76 countries, 47 of which are low- or middle-income countries.
“The main goal is to support early detection of outbreaks and to discover the unknown causes of diseases using state-of-the-art technology,” says Vida Ahyong, a senior scientist and CZ Biohub Fellow who has managed training and support for program participants from countries such as Cambodia, Madagascar, and Malawi. “It’s just amazing to work with these teams from different countries and cultures who are all working towards that common goal.”
After heading home, the in-country teams receive continued support from Biohub and CZI scientists and software engineers. The rapid analysis made possible by CZ ID provides actionable information to shape disease surveillance and control efforts, giving public health officials early insight on emerging diseases before they take hold.
With training and support from the Biohub, grantees are now able to apply mNGS to long-standing medical mysteries.
At Wits Health Consortium in South Africa, Biohub-trained researchers are following up on recent evidence that pathogenic infections are responsible for far more neonatal deaths and stillbirths than was previously thought, particularly in low- and middle-income countries. At the Pasteur Institute of Madagascar, Biohub-trained scientists are analyzing samples from patients with fever of unknown origin as well as from bats, a suspected major source of new human diseases. And at Kathmandu University in Nepal, scientists are working to find the cause of seasonal hyperacute panuveitis, an inflammatory eye disease that predominantly affects children and can cause blindness. For reasons unknown, it strikes hardest at the end of Nepal’s monsoon season.
Various Biohub-supported teams are also using metagenomics to try to track the rise of microbial antibiotic resistance in their regions, a problem the World Health Organization cites as “one of the biggest threats to global health, food security, and development today.” The training and analysis tools provided by the Biohub allowed participants to quickly pivot and provide support for their nations’ responses to the COVID-19 pandemic.
After mastering mNGS and CZ ID, some of the grantees have gone on to create their own metagenomics training opportunities at home. In Nepal, Biohub trainees Rajeev Shrestha and Nishan Katawul of Kathmandu University’s Dhulikhel Hospital have already offered sequencing mentorship to dozens of their colleagues and students and hope to create more formalized training pipelines soon. “We want to develop our own structured course with the help of Biohub,” says Shrestha. “It’s important to train more people and increase our overall capacity in genomic surveillance.”
Imran Nisar and colleagues at Aga Khan University in Karachi, Pakistan, are using what they’ve learned through Biohub mentorship to build a sequencing core, the first of its kind at that institution, which will serve as a regional reference center for disease surveillance.
Tato says that such efforts are a measure of the success of the “training the trainers” philosophy that is a linchpin of the Biohub’s approach. “The mentorship and analytical tools we provide, putting data generation and analysis in the hands of the researchers that need real-time insight into what is causing disease in their states, have proven to be transformational for these groups, whether it is during a hospital outbreak, a seasonal epidemic, or preparing for the next pandemic.”