AI-Driven Drug Discovery: From Concept to Execution
The field crossed a critical threshold in June 2025: the first peer-reviewed Phase IIa results for a molecule where both target and compound were discovered entirely by generative AI. Insilico Medicine’s rentosertib, a TNIK inhibitor for idiopathic pulmonary fibrosis, showed patients gaining lung function versus a decline in the placebo group. Publication in Nature Medicine marked what the industry had been waiting for: proof that AI-designed molecules can demonstrate efficacy in humans. And rentosertib is not alone: 31 AI-discovered assets are now in clinical phases globally.
AlphaFold 3, released in May 2024, models entire molecular complexes of proteins, DNA, RNA, and ligands, with a 50%+ improvement in prediction accuracy over prior methods. This accelerates structure-based design by clarifying where and how a molecule might bind to its target.
But the larger speed gains come from generative chemistry platforms: systems that propose novel candidates computationally, predict ADMET properties (absorption, distribution, metabolism, excretion, toxicity), and optimise for synthesisability before compounds reach a wet lab. Hit identification (finding initial compounds with activity) and lead optimisation (refining for potency and drug-like properties) now compress from 4 to 5 years down to 12 to 18 months.
Two paradigms coexist:
- Wet lab to models uses experimental data to train AI systems: phenomic screens (observing how cells change morphology and behaviour when exposed to compounds, capturing complex biological responses without requiring mechanistic knowledge) and high-throughput assays (testing thousands to millions of compounds against specific targets in automated fashion).
- Models to wet lab generates candidates computationally and validates them experimentally.
The most powerful platforms combine both, and the relationship becomes circular: Recursion’s merger with Exscientia integrates screening with generative chemistry into closed-loop systems that iterate faster than either approach alone.
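The circular relationship can be pictured as a propose, test, and feed-back loop. The following is a toy sketch only, not a description of any company's platform: the "compound" is a single number between 0 and 1, `assay` is a hypothetical stand-in for a wet-lab measurement, and `propose` is a simple hill-climbing stand-in for a generative model.

```python
def assay(x):
    """Hypothetical wet-lab readout: activity peaks at x = 0.7 (made-up value)."""
    return 1.0 - (x - 0.7) ** 2


def propose(best_x, step):
    """Hypothetical 'model': suggest designs around the best one seen so far."""
    return [min(1.0, max(0.0, best_x + d)) for d in (-step, 0.0, step)]


def closed_loop(start=0.1, rounds=8, step=0.3):
    best_x, best_y = start, assay(start)
    history = [(best_x, best_y)]
    for _ in range(rounds):
        # Models to wet lab: generate candidates, then measure them.
        results = [(x, assay(x)) for x in propose(best_x, step)]
        # Wet lab to models: the measurements steer the next round.
        x, y = max(results, key=lambda r: r[1])
        if y > best_y:
            best_x, best_y = x, y
        else:
            step /= 2  # no improvement, so search closer to the current best
        history.append((best_x, best_y))
    return best_x, best_y, history


best_x, best_y, history = closed_loop()
print(f"best design after {len(history) - 1} rounds: x={best_x:.2f}, activity={best_y:.3f}")
```

Each round spends a handful of "experiments" where the previous round's data says they are most useful, which is the sense in which the loop iterates faster than screening or generation alone.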
Yet bridging AI predictions and wet-lab validation remains hard: models can hallucinate compounds that look elegant on screen but cannot be synthesised or prove too toxic. A deeper problem is data bias, not just scarcity. The protein structures available for training come mostly from techniques that capture molecules frozen in their most stable configurations. But proteins are dynamic: they flex, twist, and briefly expose binding pockets that drugs need to reach. Models trained predominantly on static snapshots may miss the transient shapes that matter most for therapeutic intervention. Similar gaps exist at the subcellular level: most assays treat cells as monolithic units, missing how drugs interact with specific organelles (lysosomes, endoplasmic reticulum, mitochondria, etc.) where therapeutic action often occurs. Our portfolio company Oria Bioscience is addressing this by providing isolated organelles for more precise screening.
The economics explain the stakes
Typical preclinical programmes require $430 million in out-of-pocket expenses (over $1 billion capitalised) across 3 to 6 years (DiMasi et al., 2016). Insilico Medicine’s rentosertib programme reached preclinical candidate nomination in 18 months at a cost of $3 to $5 million, a ~99% cost reduction and 70%+ time compression (GEN Edge, June 2025). From 2021 to 2024, the company nominated 22 preclinical candidates, averaging 12 to 18 months per programme with only 60 to 200 molecules synthesised per project.
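The percentages above are simple ratios. A minimal sketch using only the figures quoted in this section; the helper name `reduction` is illustrative, and comparing against the $430 million out-of-pocket baseline (rather than the capitalised figure) is an assumption about how the ~99% was derived.

```python
def reduction(baseline, new):
    """Fractional reduction of `new` relative to `baseline`."""
    return 1 - new / baseline


# Figures quoted in the text (USD millions and months).
BASELINE_COST_M = 430        # out-of-pocket preclinical cost (DiMasi et al., 2016)
AI_COST_RANGE_M = (3, 5)     # quoted cost range for the AI-driven programme
BASELINE_MONTHS = (36, 72)   # 3 to 6 years
AI_MONTHS = 18               # rentosertib: time to preclinical candidate

cost_cut = [reduction(BASELINE_COST_M, c) for c in AI_COST_RANGE_M]
time_cut = [reduction(b, AI_MONTHS) for b in BASELINE_MONTHS]

print(f"cost reduction:   {min(cost_cut):.1%} to {max(cost_cut):.1%}")
print(f"time compression: {min(time_cut):.1%} to {max(time_cut):.1%}")
```

Under this baseline the cost reduction lands at roughly 99% across the whole quoted range, while the time compression depends on where in the 3 to 6 year range the comparison is made: 50% against 3 years and 75% against 6.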
Drug assets appreciate exponentially through clinical phases: typical valuations rise from ~$45M at Phase I entry, to ~$250M at Phase II, to over $1B at Phase III, and $2B to $4B at approval (BayBridge Bio; Therapeutic Innovation & Regulatory Science, 2022). But success probabilities compound harshly: 69% from preclinical to Phase I, 54% Phase I to II, 34% Phase II to III (Paul et al., 2010). Multiplied together, these three rates give a preclinical candidate roughly 13% cumulative odds of reaching Phase III, before Phase III and regulatory attrition are counted. More recent data suggests that a molecule entering Phase I now has only ~7% odds of reaching approval (Citeline / Norstella, 2024).
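Success probabilities compound by multiplication. A minimal sketch using only the three transition rates quoted above; Phase III and regulatory success rates are not listed in this section, so the sketch stops at Phase III entry rather than assuming values for them. The dictionary keys and the `cumulative` helper are illustrative names.

```python
from math import prod

# Stage-transition success rates as quoted in the text (Paul et al., 2010).
TRANSITIONS = {
    "preclinical -> Phase I": 0.69,
    "Phase I -> Phase II": 0.54,
    "Phase II -> Phase III": 0.34,
}


def cumulative(rates):
    """Probability of clearing every transition in `rates`, one after another."""
    return prod(rates)


rates = list(TRANSITIONS.values())
from_preclinical = cumulative(rates)      # all three transitions
from_phase_1 = cumulative(rates[1:])      # skip the preclinical step

print(f"preclinical candidate reaching Phase III: {from_preclinical:.1%}")
print(f"Phase I entrant reaching Phase III:       {from_phase_1:.1%}")
```

The three quoted rates multiply to about 12.7%; odds of reaching approval are lower still once Phase III and regulatory review are included.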
Platforms that compress discovery timelines and reduce cost per asset allow more shots on goal at each stage, systematically improving expected value across portfolios.
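The "shots on goal" argument can be made concrete with a back-of-envelope model. This is a sketch under loud assumptions: programmes are treated as independent, every programme has the same success probability, `P_SUCCESS = 0.10` is a hypothetical value chosen only for illustration, and downstream clinical costs (which this section does not quantify) are ignored. The per-programme costs are the figures quoted earlier in this section.

```python
def p_at_least_one(n_programmes, p_success):
    """Chance that at least one of n independent programmes succeeds."""
    return 1 - (1 - p_success) ** n_programmes


def expected_successes(n_programmes, p_success):
    """Expected number of successful programmes."""
    return n_programmes * p_success


BUDGET_M = 430     # USD millions: one conventional preclinical programme
P_SUCCESS = 0.10   # HYPOTHETICAL per-programme success probability

portfolio = {}
for label, cost_m in [("conventional", 430), ("AI-assisted", 5)]:
    n = BUDGET_M // cost_m  # whole programmes the budget can fund
    portfolio[label] = (n, expected_successes(n, P_SUCCESS), p_at_least_one(n, P_SUCCESS))

for label, (n, exp, p_any) in portfolio.items():
    print(f"{label:>12}: {n:3d} programmes, "
          f"expected successes {exp:.1f}, P(at least one) {p_any:.1%}")
```

The absolute numbers depend entirely on the assumed success probability; the point is only that, for a fixed budget, a lower cost per programme raises both the expected number of successes and the chance of at least one.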