Hard Fork

Where Is All the A.I.-Driven Scientific Progress?

December 26, 2025

Key Takeaways

  • Sam Rodriques's AI scientist, Kosmos, can accomplish work equivalent to six months of doctoral or postdoctoral research in a single 12-hour run by leveraging multiple large language models and a structured world model for complex task orchestration.
  • The primary bottlenecks for AI-accelerated medical breakthroughs are not in the initial discovery phase (which tools like Kosmos are addressing), but in the slow, expensive, and regulated processes of manufacturing, patient recruitment, and clinical trials. 
  • The current landscape of AI in science is split between modeling the natural world (e.g., protein folding) and modeling the process of science itself (e.g., AI agents like Kosmos), with the latter being a major focus for accelerating research iteration. 

Segments

Introduction to AI Science Hype
(00:00:00)
  • Key Takeaway: AI lab leaders frequently cite curing diseases and solving climate change as justifications for their work, often as a response to criticism of current AI models.
  • Summary: The episode frames the discussion around the ambitious claims made by major AI lab leaders regarding AI’s imminent impact on scientific discovery. Hosts note that these claims often serve as a counter-narrative when AI models face public scrutiny. The White House’s Genesis Mission was announced to coordinate national efforts toward AI-accelerated innovation.
Introducing Guest and Kosmos
(00:03:37)
  • Key Takeaway: Sam Rodriques, CEO of FutureHouse and Edison Scientific, is introduced as an expert with a PhD in physics and experience running an applied biotech lab.
  • Summary: Sam Rodriques is presented as the expert guest to evaluate the reality versus the hype in AI science. His company, Edison Scientific, developed Kosmos, an AI agent designed to function as an ‘AI scientist.’ Rodriques is positioned as optimistic but more skeptical than the most extreme futurists.
Kosmos Performance Metrics
(00:07:47)
  • Key Takeaway: The claim that Kosmos accomplishes six months of doctoral or postdoctoral research in 12 hours was validated by comparing its overnight findings against work that took human collaborators three to six months to complete.
  • Summary: The six-month research equivalence claim for Kosmos was measured by having academic collaborators attempt to replicate its overnight discoveries using the same data sets. Kosmos functions as an agent that runs for about 12 hours, returning deep insights, which are correct about 80% of the time.
Kosmos Architecture and Cost
(00:09:52)
  • Key Takeaway: Kosmos is built by layering atop models from OpenAI, Google, and Anthropic, augmented by proprietary models, with its key innovation being a ‘structured world model’ that maintains task coherence over long runs.
  • Summary: Kosmos is not a simple chatbot; it runs for extended periods (around 12 hours) to complete complex objectives. The high cost ($200 per prompt) is attributed to the massive compute required, including writing 42,000 lines of code and reading 1,500 research papers on average per run.
Novel Discoveries and Validation Process
(00:13:08)
  • Key Takeaway: Kosmos has already generated four net new scientific contributions, including identifying a novel mechanism linking a genetic variant to insulin secretion in Type 2 diabetes.
  • Summary: The AI is already making novel discoveries, not just replicating existing findings, as evidenced by its work on genetic variants. Any discovery made by Kosmos requires subsequent manual validation and experimentation by human scientists before it can be acted upon.
Bottlenecks in Scientific Translation
(00:16:20)
  • Key Takeaway: The major bottleneck preventing rapid cures is the slow, multi-year process of clinical trials, drug manufacturing, patient recruitment, and regulatory approval, not the speed of initial discovery.
  • Summary: While AI can optimize the planning of experiments, the physical and regulatory steps required to test drugs in humans inherently take a long time. AI’s immediate value is in ensuring that the expensive, slow experiments (like clinical trials) are optimally planned based on the maximum available knowledge.
AI Adoption in Working Science
(00:34:10)
  • Key Takeaway: AI tools have not yet significantly changed the workflow for most working scientists, especially in biology, due to the conservative nature of experimental protocols, but adoption is accelerating in coding and literature search.
  • Summary: Biologists tend to adopt new methods slowly because they rely on established, working protocols, even if those protocols are not fully understood. Coding assistance and literature search are seeing rapid adoption because they address historical bottlenecks in those specific scientific tasks.
Overhyped vs. Underhyped Game
(00:35:37)
  • Key Takeaway: Brain-computer interfaces (BCIs) and quantum computing are currently considered overhyped relative to their near-term utility, while generative design in biology (like AlphaFold 3) is potentially underhyped.
  • Summary: "Vibe proving" of mathematical proofs is deemed overhyped for direct utility but useful as a driver of AI benchmarks. Robotics for lab automation is considered appropriately hyped: transformative, but still technologically immature. AlphaFold 3 is suggested to be underhyped because its transformative potential in biology is immense.
Top 2025 Advancements and Future Outlook
(00:38:25)
  • Key Takeaway: The year 2025 has been defined by the rise of AI agents in science, with predictions that by 2027, the majority of high-quality scientific hypotheses could be generated by these agents.
  • Summary: The top three advancements of the year were AI agents for science, de novo antibody design (which drastically cuts down pre-clinical development time), and the generation of novel organisms by groups like the Arc Institute. The expectation is that AI agents will continue to grow rapidly in capability and adoption throughout 2026.