AI Language Model Generates Functional Antibodies Against Diverse Viral Targets

Researchers have developed an artificial intelligence model capable of generating novel antibody sequences specific to a range of viral antigens using only the target's amino acid sequence. The model, named MAGE (Monoclonal Antibody Generator), was designed to produce paired heavy and light chain antibody sequences without relying on template antibodies or structural input.
The study, led by scientists at Vanderbilt University Medical Center and collaborators, involved fine-tuning the protein language model ProGen2 on more than 18,000 antibody-antigen sequence pairs sourced from public databases and from high-throughput techniques such as LIBRA-seq. The resulting model was trained to identify sequence features associated with binding specificity and to generate full-length human antibody sequences in response to antigen prompts.
To assess performance, MAGE was prompted with sequences from three viral targets: the SARS-CoV-2 receptor binding domain (RBD), the prefusion F protein of respiratory syncytial virus A (RSV-A), and the hemagglutinin (HA) of a recently emerged H5N1 avian influenza strain (A/Texas/37/2024). The targets were selected based on varying degrees of representation in the training dataset.
Experimental validation showed that, of 20 antibodies generated against the SARS-CoV-2 RBD, 9 bound the target in ELISA and 4 neutralized pseudovirus particles, including one with a half-maximal inhibitory concentration (IC50) below 10 ng/mL.
Against RSV-A, 7 of 23 tested antibodies demonstrated binding, and 3 neutralized the virus in vitro. Cryo-electron microscopy revealed distinct binding modes for two of the RSV-directed antibodies, with evidence of mutations contributing to their antigen recognition. One antibody, RSV-3301, targeted a less frequently characterized epitope and displayed somatic mutations not commonly observed in the training dataset.
When MAGE was prompted with the HA sequence from the H5N1 A/Texas/37/2024 strain, which was not seen during training, 5 of the 18 generated candidates tested showed binding, and all 5 neutralized the virus. The HA sequence shares approximately 91.5% identity with an older H5N1 strain present in the training data, so this target tests the model’s performance under limited prior representation.
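The hit rates reported for the three targets can be compared side by side with a few lines of arithmetic. The following is a minimal sketch, not part of the study's code: the counts are taken from the figures above, the denominators are assumed to be the number of antibodies tested per target, and the names `results`, `hit_rates`, and `summarize` are illustrative.

```python
from fractions import Fraction

# Counts as reported in the article: (tested, binders, neutralizers).
# Assumption: each denominator is the number of antibodies tested for that target.
results = {
    "SARS-CoV-2 RBD": (20, 9, 4),
    "RSV-A prefusion F": (23, 7, 3),
    "H5N1 HA (A/Texas/37/2024)": (18, 5, 5),
}


def hit_rates(tested, binders, neutralizers):
    """Return (binding rate, neutralization rate) as exact fractions of antibodies tested."""
    return Fraction(binders, tested), Fraction(neutralizers, tested)


def summarize(results):
    """Map each target to (binding %, neutralization %), rounded to one decimal place."""
    rows = {}
    for target, (tested, binders, neutralizers) in results.items():
        bind, neut = hit_rates(tested, binders, neutralizers)
        rows[target] = (round(float(bind * 100), 1), round(float(neut * 100), 1))
    return rows


if __name__ == "__main__":
    for target, (bind_pct, neut_pct) in summarize(results).items():
        print(f"{target}: {bind_pct}% binding, {neut_pct}% neutralizing")
```

On these counts, binding rates range from roughly 28% (H5N1 HA) to 45% (SARS-CoV-2 RBD). With only about 20 antibodies tested per target, the percentages are rough indications rather than precise estimates.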
The study also evaluated predicted developability metrics using the Therapeutic Antibody Profiler (TAP). None of the validated antibodies exceeded high-risk thresholds based on clinical-stage therapeutic benchmarks, although several fell within medium-risk categories for parameters such as CDR length.
MAGE represents a sequence-based, structure-independent approach for generating antigen-specific antibodies. Unlike methods focused on affinity maturation or CDR redesign, the model generates entire variable regions without a template. The authors note that although MAGE was optimized for specificity rather than affinity or neutralization, a portion of the generated antibodies neutralized their target virus.
The findings suggest that MAGE may contribute to early-stage antibody discovery workflows, particularly in contexts where structural information or donor-derived antibodies are unavailable. The model’s open-source code and training data have been made publicly accessible to support further research and development.