Scientists have learned a lot about evolution by studying fossils, by observing nature and, more recently, by unraveling the genetic code stored in DNA.
Now, a team of Stanford computer scientists and biologists has looked at evolution through a new lens, by analyzing how proteins — the biological machines produced by DNA — evolve to sustain the network of molecular interactions upon which all life depends.
The scientists studied 1,840 species, from bacteria to primates, to understand how evolution built life forms that could survive in the face of natural adversities. What they discovered was profound yet intuitive: Every species has evolved backup plans that allow its protein machinery to find bypasses and workarounds when nature tries to gum up the works. No previous study has surveyed such a broad swath of species to find a survival strategy common to all life: Develop versatile and robust molecular machinery.
“Across our entire sample, we find that the resilience of a species is strongly correlated with having protein networks that are robust to failure and can interact in multiple ways to preserve life,” said Stanford computer scientist Jure Leskovec, senior author of the paper that appears today in Proceedings of the National Academy of Sciences.
Evolutionary biologist Marcus Feldman, a co-author on the paper, said this is the most ambitious effort yet to understand what scientists call the interactome: the sum total of all the protein interactions in a species, just as the genome is the sum total of a species’ DNA. “We’re looking at the mechanism of evolution on an unprecedented scale, using the tools of data science to study the structure of the protein networks that make life possible,” Feldman said.
To conduct the study, Stanford postdoctoral scholar Marinka Zitnik built the database of 1,840 organisms and collected data on 9 million protein interactions. The team’s premise was that natural selection had already identified these organisms as fit to survive. By asking the right questions, and developing the right analytical techniques, they looked for patterns in the data to help reveal the principles of interactome evolution.
The researchers wanted to understand how protein machines deal with the unexpected. So, they ran a series of data science experiments to disrupt the protein networks that sustain life. In a computational analysis, they knocked out a certain percentage of each organism’s proteins at random. They did this systematically for all 1,840 species, repeatedly computing whether some sort of backup system would allow the protein networks to continue to function in a way that would support life, until at some point the disruptions caused the protein machinery to fall apart.
Leskovec likened this analytical approach to throwing a sheet of glass against the ground and counting how many pieces it breaks into. If only some small pieces of the glass break away, this indicates a high degree of resilience. Similarly, if an organism’s protein networks remain largely intact even when some proteins are removed, this suggests that the organism is resilient. The study showed that organisms stave off collapse through all manner of backup and workaround mechanisms, revealed by the ability of their protein networks to maintain system integrity.
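The knockout experiment can be illustrated with a minimal sketch. It is not the study's code: the fragmentation measure used here (the share of surviving proteins that remain in the largest connected piece of the network) is an assumed stand-in for whatever measure the paper used, and the function names and the two toy networks are hypothetical.

```python
import random
from collections import deque


def build_adjacency(n, edges):
    """Build an undirected network of n proteins from a list of interaction pairs."""
    adjacency = {i: set() for i in range(n)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def largest_component_fraction(adjacency, removed):
    """Share of surviving proteins that sit in the largest connected piece."""
    survivors = [node for node in adjacency if node not in removed]
    if not survivors:
        return 0.0
    seen = set()
    best = 0
    for start in survivors:
        if start in seen:
            continue
        # Breadth-first search over surviving proteins only.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in removed and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / len(survivors)


def knockout_curve(adjacency, fractions, trials=20, seed=0):
    """Average largest-piece share after knocking out each fraction at random."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    curve = []
    for fraction in fractions:
        k = int(round(fraction * len(nodes)))
        total = 0.0
        for _ in range(trials):
            removed = set(rng.sample(nodes, k))
            total += largest_component_fraction(adjacency, removed)
        curve.append(total / trials)
    return curve


# Hypothetical toy networks: a chain has no backup routes, while a fully
# connected network has many.
chain = build_adjacency(12, [(i, i + 1) for i in range(11)])
dense = build_adjacency(12, [(i, j) for i in range(12) for j in range(i + 1, 12)])

fractions = [0.0, 0.25, 0.5]
chain_curve = knockout_curve(chain, fractions)
dense_curve = knockout_curve(dense, fractions)
```

In this toy setup the fully connected network stays in one piece at every knockout level, while the chain breaks into fragments, which mirrors the sheet-of-glass comparison on a very small scale.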
The researchers corroborated this notion of network resilience in a second way. They used this shattering technique to compare species over time. Based on the fossil record and DNA studies, scientists know roughly the order in which various life forms in the sample appeared in evolution. If protein network resilience confers an evolutionary advantage, the researchers hypothesized, later-evolved organisms should have networks that are more shatterproof than those of preceding life forms. This is exactly what they found.
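One simple way to check for such a trend is a rank correlation between the order in which species appeared and their resilience scores. The sketch below uses Spearman's rank correlation as an assumed choice; the article does not say which statistic the team used, and the function names and example numbers are placeholders, not values from the study.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists.

    Assumes each list contains at least two distinct values.
    """
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    n = len(xs)
    mean_x = sum(rank_x) / n
    mean_y = sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return cov / (var_x * var_y) ** 0.5


# Placeholder numbers for illustration only (NOT data from the study):
# position in the evolutionary ordering and a resilience score per species.
evolutionary_order = [1, 2, 3, 4, 5]
resilience_score = [0.31, 0.28, 0.40, 0.47, 0.52]

trend = spearman(evolutionary_order, resilience_score)
```

A value of `trend` near 1 would indicate that later-appearing species tend to have more resilient networks, while a value near 0 would indicate no consistent ordering.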
“Over billions of years evolution has worked to make protein networks more resilient against failure,” Leskovec said.
Leskovec believes that by studying the genome and interactome together, data scientists can better understand how evolution works. Information about how organisms are built and improved over time is stored in the genome. But as this study shows, the interactome is important to evolution, too: DNA creates and regulates protein networks, which develop backup processes to adapt to changing circumstances. In some cases, these adaptations prove so useful to a species that its genome preserves these protein improvements so they can be inherited.
“Genes can’t explain it all,” he said. “We can gain deep insights into many features of organisms by exploring quantitative properties of proteins and the computational patterns of networks of their interactions.”
Marcus Feldman is the Burnet C. and Mildred Finley Wohlford Professor in the School of Humanities and Sciences, a co-director of Stanford’s Center for Computational, Evolutionary and Human Genomics and director of the Morrison Institute for Population and Resource Studies. Marinka Zitnik is a postdoctoral scholar in the Department of Computer Science. Senior research engineer Rok Sosic also co-authored this study.
This research was supported by the National Science Foundation, the National Institutes of Health, DARPA, Boeing, the Stanford Data Science Initiative, the Chan Zuckerberg Biohub, the Stanford Center for Computational, Evolutionary and Human Genomics, and the John Templeton Foundation.