
Making Drugs With AI: How Machine Learning Can Design Better Antibodies Than Traditional Methods

Jun 4, 2025

Each $10,000 experiment can make or break a pharmaceutical company’s antibody development program, yet the vast majority fail to improve upon promising drug candidates. Samuel Stanton, a CDS PhD graduate who is now Principal Machine Learning Scientist at Genentech, and his colleagues there have created a system that transforms this expensive gamble into a data-driven process, successfully engineering antibodies with 3 to 100 times better binding strength than initial lead molecules across four therapeutic targets.

Their “Lab-in-the-loop” system, described in a paper titled “Lab-in-the-loop therapeutic antibody design with deep learning,” combines machine learning models with laboratory experiments in iterative cycles. The work was conducted with CDS Professor Kyunghyun Cho, former CDS co-Director Richard Bonneau, CDS PhD graduate Omar Mahmood, and many others. Over 1,800 unique antibody variants were designed and tested, starting from promising leads obtained through animal immunization campaigns.

The breakthrough addresses a fundamental challenge in drug development: optimizing multiple competing properties simultaneously. “It’s not just about making something stick better,” Stanton said. “For a molecule to become a drug, there’s potency, safety, and developability all to consider.”

The process begins with antibody sequences from immunized animals, which serve as starting points for machine learning models to generate thousands of variants. These designs are ranked by algorithms that estimate binding affinity, expression levels, and safety issues. The most promising candidates undergo laboratory testing using surface plasmon resonance, the gold standard for measuring antibody binding.
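As a rough illustration of the ranking step described above, the following toy sketch filters candidate designs on predicted expression and safety, then sends the highest predicted-affinity designs to the lab. Every name, score, and threshold here is hypothetical and chosen only for illustration; it is not the team's actual pipeline or scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical antibody design with model-predicted properties."""
    name: str
    predicted_affinity: float    # higher means tighter predicted binding (arbitrary units)
    predicted_expression: float  # estimated chance the design expresses, from 0 to 1
    safety_flags: int            # number of predicted safety or developability issues


def select_for_lab(candidates, batch_size, min_expression=0.5, max_flags=0):
    """Keep designs that pass the expression and safety filters,
    then return the top `batch_size` by predicted affinity."""
    viable = [
        c for c in candidates
        if c.predicted_expression >= min_expression and c.safety_flags <= max_flags
    ]
    viable.sort(key=lambda c: c.predicted_affinity, reverse=True)
    return viable[:batch_size]


# Made-up designs: variant_b is predicted not to express well,
# and variant_d carries predicted safety issues.
designs = [
    Candidate("variant_a", 0.90, 0.8, 0),
    Candidate("variant_b", 0.95, 0.3, 0),
    Candidate("variant_c", 0.70, 0.9, 0),
    Candidate("variant_d", 0.85, 0.9, 2),
]

selected = select_for_lab(designs, batch_size=2)
print([c.name for c in selected])  # prints ['variant_a', 'variant_c']
```

Note that the design with the highest predicted affinity, variant_b, is not selected: in this sketch, a strong binder that is unlikely to express never reaches the expensive laboratory step.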

What sets this approach apart is its embrace of iterative learning from failure. “The first round was terrible. We got basically no binders,” Stanton recalled. “In the very beginning, we even had some trouble expressing and purifying the antibodies proposed by our system.”

The team’s patience proved crucial. By the fourth round, the system was generating libraries with a 99% expression rate, in which more than 26% of designs showed at least three-fold binding improvement over their starting molecules. Some achieved 100-fold improvements, reaching therapeutically relevant binding strengths (technically, kinetic dissociation constants) in the range of hundreds of picomolar.
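To make the fold-improvement arithmetic concrete: a lower dissociation constant means tighter binding, so fold improvement can be read as the lead's dissociation constant divided by the design's. The sketch below uses invented values in picomolar, not measurements from the paper, and assumes this simple ratio is the quantity being reported.

```python
def fold_improvement(lead_kd_pm, design_kd_pm):
    """Ratio of lead to design dissociation constant (both in picomolar).
    Values above 1 mean the design binds more tightly than the lead."""
    return lead_kd_pm / design_kd_pm


# Hypothetical numbers for illustration only.
lead_kd_pm = 30_000.0  # a 30 nanomolar lead, expressed in picomolar
design_kds_pm = [45_000.0, 12_000.0, 9_000.0, 300.0]

folds = [fold_improvement(lead_kd_pm, kd) for kd in design_kds_pm]

# Count designs that clear the three-fold threshold mentioned in the text.
improved = [f for f in folds if f >= 3.0]
fraction_improved = len(improved) / len(folds)

print(f"best fold improvement: {max(folds):.0f}")
print(f"fraction with at least 3-fold improvement: {fraction_improved:.0%}")
```

In this made-up library, the best design goes from 30 nanomolar to 300 picomolar, a 100-fold improvement that lands in the hundreds-of-picomolar range the article describes.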

The team solved crystal structures of both starting antibodies and improved designs to understand the molecular basis for enhanced binding. Rather than making dramatic structural changes, the AI-designed antibodies preserved the overall shape of binding regions while introducing strategic mutations that formed new stabilizing interactions.

This work builds on earlier efforts, including influential papers from David Gifford’s lab at MIT (“Antibody Complementarity Determining Region Design Using High-Capacity Machine Learning”) and Frances Arnold’s lab at Caltech (“Machine-learning-guided directed evolution for protein engineering”). However, those studies used noisier, high-throughput experimental setups and didn’t address the multiple constraints required for therapeutic development.

The challenge of implementing such systems goes beyond technical hurdles. When experiments cost $10,000 each and early results show no improvement, maintaining confidence requires substantial organizational commitment. “Black-box optimization is really heralded as a solution for problems where data collection is very expensive,” Stanton explained. “But when your annotation costs are that high, people are very reluctant to allow what looks like waste.”

The team tested their approach on antibodies targeting EGFR, IL-6, HER2, and OSM — proteins involved in cancer, inflammation, and autoimmune diseases. All showed significant improvements, suggesting broad applicability across therapeutic areas.

The views of Samuel Stanton represented here are his personal views and do not necessarily represent his employer.

By Stephen Thomas


Written by NYU Center for Data Science

Official account of the Center for Data Science at NYU, home of the Undergraduate, Master’s, and Ph.D. programs in Data Science.
