A scientist I know has a tattoo on his arm of a sketch from Darwin’s notebooks. Spidery lines radiate outward from the center, branching and splitting again and again, ending with labels: A, B, C, and D. The diagram is simple, but the idea is profound: a tree of life, with lines of descent that connect you and me to everything that lives and has ever lived on Earth.
If I had to pick a single image to show how evolutionary biologists like myself think about what we do, my choice would be the same. Evolutionary trees are narratives of history written in genetic code, records of family lineages that stretch into the distant past. For anyone who knows how to look, the past, present, and possible futures of the new coronavirus can be found in its evolutionary tree. We are uncovering the tree now, bit by bit, and, with the help of new technologies that make it cheaper and faster to sequence viral genomes, we are reading the virus’s tree faster than we have in any epidemic before. The question now is: Can we read it fast enough to make a difference as we race to limit the virus’s spread?
In the earliest days of the epidemic, the virus’s evolutionary tree hinted at the seriousness of the problem we would face. On January 10th, ten days after the world first heard about a cluster of mysterious pneumonias in Wuhan, the first published viral genome told us something important: the virus responsible was new to humankind. Its symptoms resemble those of influenza, pneumonia, and the common cold, and the virus is in the coronavirus family, as the name COVID-19 suggests. But the virus has not infected humans before, which means that we have limited immune defenses to stop its spread.
A second important finding came soon after the first. Early reports from the World Health Organization had suggested that the virus had jumped multiple times from its animal host into humans, with limited person-to-person spread. But as more viral genomes were sequenced, in mid-January, their evolutionary tree told a different story. When researchers compared genomes from the COVID-19 outbreak to coronaviruses found in bats, pangolins, and other animals, they found that the viruses from the outbreak clustered in a single branch of the tree, recent descendants of a single spillover event in November or December. We still don’t know the of this spillover event. But, as we now know too well, the story the tree told was correct: the new virus spreads easily from person to person.
In late January, a man flew from Wuhan to Washington State and got sick, becoming the first known case of COVID-19 in the United States. For weeks afterward, we learned of similar cases, instances in which the virus was acquired abroad that seemed, at the time, to be isolated offshoots. That changed at the end of February, when researchers in Seattle identified a patient who got COVID-19 without travelling abroad or knowing someone who did. (Several of those researchers are my collaborators and mentors, and this article draws heavily from their work, particularly that of the virologist .)
The genomes of the Washington cases told a startling story: the viruses from the Wuhan traveller and the community-transmission case were closely related, the second virus a direct descendant of the first. Three mutations had arisen along the transmission chain to distinguish the two viruses, ticks on a genetic clock marking the time that elapsed. The viral genomes showed that after the coronavirus reached Washington State, in late January, it grew into its own branch of the tree, spreading silently through the city for weeks. Based on the cryptic transmission evident from the evolutionary tree, researchers estimated that the virus might have infected five hundred to six hundred people in Washington State by early March, far more than the eighteen cases reported at the time.
In the month since we learned that the virus was spreading in Seattle, the Washington branch of the tree has kept growing, its tips splitting and branching as the virus continues to spread. In Washington State, nearly five thousand new cases were diagnosed in March, most of them . This branch has grown thick enough to seed its own outbreaks. We have found its descendants in New York, California, Connecticut, Minnesota, and Wisconsin, some of the few states to publish viral genome sequences so far. The Washington branch has also sparked clusters of cases as far away as Iceland and Australia, a testament to our interconnected world. (As these branches have spread, they have developed different mutations, a normal consequence of mistakes in viral replication. So far, there is no evidence that these mutations change how the virus infects people.)
Across the rest of the coronavirus tree, . In the United States, the tree shows that the coronavirus crossed our borders and spread multiple times. Seattle’s outbreak is fuelled by several distinct branches, evidence that the virus has taken multiple paths from Wuhan to Washington State. belong to a single branch whose closest relatives are in Europe, not China, suggesting that the virus crossed the Atlantic rather than the Pacific Ocean to arrive on the East Coast. Across the United States, the virus has continued to replicate nationwide, in defiance of the patchwork restrictions put forth to stop it.
In the Ebola and Zika outbreaks of the past several years, evolutionary trees revealed the virus’s patterns of spread around the world, but sometimes not until the outbreaks began. The new coronavirus has spread far faster, but . Between January and early April, more than twenty-five hundred COVID-19 genomes were published, making it possible to track how the virus has spread and evolved in almost real time.
These advances raise the tantalizing possibility that knowledge of viral evolution can alter the course of this pandemic. It may already have made a difference in Seattle, where genome sequencing of the first two known Washington cases alerted researchers to weeks of silent viral spread, an insight impossible from positive tests alone. It was, in part, these clues about a larger, hidden outbreak that prompted swift social-distancing measures, long before test results began to catch up.
For now, during this period of extreme social distancing, a major priority is to implement widespread diagnostic testing to learn where and how quickly the virus is spreading. Eventually, may tell us who should stay home and who might safely go back to work, letting the economy restart. But the case of Seattle points toward a more distant future in which routine genome sequencing can also . Someday, genome sequencing may help us figure out who transmits the virus to whom. It might teach us how often the virus spreads through close contact, as opposed to passing encounters. And once social distancing lets up, expanded diagnostic testing paired with genome sequencing can alert us to the virus’s likely resurgences.