Biohub puts $500 million behind AI biology push with MIT, Harvard, and NVIDIA
The five-year Virtual Biology Initiative brings together the Broad Institute, Allen Institute, Arc Institute, Wellcome Sanger Institute, Human Cell Atlas, Human Protein Atlas, NVIDIA, and Renaissance Philanthropy in a coordinated global effort to generate the data needed for predictive models of the human cell.
Biohub, the biomedical research organization co-founded by Priscilla Chan and Mark Zuckerberg, has committed $500 million over five years to generate the open datasets needed to build AI-powered predictive models of the human cell.
The Virtual Biology Initiative, announced on April 29, brings together the Broad Institute of MIT and Harvard, the Wellcome Sanger Institute, the Allen Institute, Arc Institute, NVIDIA, and international consortia including the Human Cell Atlas and the Human Protein Atlas.
Priscilla Chan, Co-Founder of Biohub and Co-CEO of the Chan Zuckerberg Initiative, shared the announcement on LinkedIn, writing that the initiative represents "a major new commitment to help build the data foundation biology needs for its next leap."
Of the $500 million, $100 million will fund external research to support a coordinated global data-generation effort, while $400 million will go toward internal technology development at Biohub, including next-generation imaging, molecular engineering, and data infrastructure.
Why the data gap matters
Alex Rives, Head of Science at Biohub, says: "To build artificial intelligence that can accurately represent the full complexity of biology and accelerate scientific research, we need orders of magnitude more data than exists today. We need new technologies to observe the cell, from the molecular to the tissue level, and in the context of health and disease."
Rives, who previously served as Chief Scientist at EvolutionaryScale and as a research scientist at Facebook AI, posted on LinkedIn that the initiative aims to "generate the data to unlock scaling laws in biology and build accurate predictive models of the cell." He added that Biohub is "contributing to this effort as both a funder and a builder."
The initiative builds on Biohub's existing large-scale data projects, including CELLxGENE, the Tabula Sapiens multi-organ cell atlas, and the Billion Cells Project, which coordinates 17 projects across institutions including MIT, Stanford, UC San Francisco, Columbia, and ETH Zurich.
Global coalition spans five countries
The partner institutions each bring distinct capabilities. The Broad Institute has been central to developing single-cell RNA sequencing and spatial transcriptomics. The Wellcome Sanger Institute contributed to the Human Genome Project and is a leading force in planetary and disease genomics. The Allen Institute will contribute datasets and biological models, while Arc Institute brings its work on genome engineering and virtual cell modeling.
Eric S. Lander, Founding Director of the Broad Institute, says: "Fully deciphering the logic of cells is a huge challenge, but it has the potential to transform medicine. And, it's a challenge that will once again take many groups and perspectives collaborating together."
The Human Cell Atlas, which has grown to more than 3,900 members at over 1,700 institutions in more than 100 countries, will expand its cell atlases using next-generation spatial omics technologies. The Human Protein Atlas, based in Sweden and running since 2003, will add proteomic and imaging data to the effort.
Emma Lundberg, Co-Director of the Human Protein Atlas at Stanford University and KTH, states: "A global coordinated data foundation for modern AI-powered biology is exactly what we need to break siloes and accelerate progress towards high-fidelity simulators of biology."
NVIDIA will provide accelerated computing infrastructure, domain-specific software, and technical expertise to help process and analyze the datasets.
Open data as the operating principle
All data generated by Biohub through the initiative will be made openly and freely available to the global scientific community. Chan emphasized this point on LinkedIn, noting that "most biological data is siloed, proprietary, and configured to answer one specific question" and that the field needs "foundational datasets that are open, standardized, and built for everyone to use and build upon."
Renaissance Philanthropy has joined the initiative to help expand the funding base. Biohub says additional funders, research institutions, and partners are expected to participate, with the organization describing the current commitment as the starting point for a much larger global effort.