Biotech firm aims to create 'ChatGPT of biology' – will it work?
A UK biotech firm, Basecamp Research, has been on a mission to gather genetic data from microbes thriving in extreme environments worldwide. The company says this effort has uncovered more than a million previously unknown microbial species and nearly 10 billion genes that are new to science.
The company envisions using this vast database of genetic information to train a "ChatGPT of biology" – an AI biologist that can provide insights into life on Earth. However, the question remains: will this ambitious endeavor succeed?
Uncovering Genetic Diversity
Basecamp's genetic data collection spans more than 120 sites across 26 countries, with a focus on sampling unexplored extreme environments, including frigid Arctic waters and hot jungle springs. The samples consist mainly of microbes, chiefly prokaryotes such as bacteria, along with viruses and a smaller number of fungi.
Sequencing these samples has revealed how genes shared across different life forms vary from one organism to another. The company estimates that its data includes more than a million species not present in publicly available genomic datasets. This trove encompasses approximately 9.8 billion newly identified genes, a significant expansion of the known genetic landscape.
The Promise of Enhanced Biological Understanding
With its extensive collection of genetic data, Basecamp aims to improve how well AI models understand biological processes. By exposing these models to a broader spectrum of nature, the company hopes to give them a deeper grasp of biology and, ultimately, to build what it calls a "ChatGPT of biology."
Earth may host a staggering number of microbial species, yet many of them remain unexplored. Basecamp's discovery of numerous novel genes and species reflects the untapped biodiversity present on our planet.
Challenges and Skepticism
Despite the excitement surrounding Basecamp's findings, some experts remain cautious about the potential outcomes. Because so much of this genetic material is new, its practical utility is open to question, particularly in areas like drug discovery or biotechnology.
Additionally, the sheer diversity of these newly identified genes makes it hard to predict their functions accurately. Without a solid understanding of what these genes do, training AI models on this data may prove to be a daunting task.
While Basecamp's initiative holds promise for unlocking new realms of biology, the road ahead may require a combination of advanced AI technologies and traditional laboratory work to decipher the mysteries hidden within this genetic treasure trove.










