Fine-tuning Boltz for Antibody-Antigen Binding Prediction

Kim, Ji Won

Author(s)

Kim, Ji Won

Advisor

Barzilay, Regina

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Accurate prediction of antibody-antigen binding is a central challenge in computational immunology, and its direct implications for therapeutic antibody design and vaccine development have made it a rapidly growing field. Recent advances in protein language models and structure prediction have provided new modeling tools, yet these approaches often fail to capture the fine-grained features that drive binding specificity between antibodies and antigens. This thesis evaluates multiple strategies for improving predictive performance. First, we investigate custom multiple sequence alignments (MSAs). Standard Boltz-2 training relies on MSAs from broad protein databases, which capture global diversity but under-represent lineage-specific constraints. To address this, we constructed antibody-specific MSAs to test whether restricting the search space to antibody repertoires improves model learning. Gains in downstream binding prediction were limited, suggesting that further work is needed on training models for specific databases in the first place. Our second line of investigation focused on fine-tuning Boltz-2, a generative structural foundation model, on curated antibody-antigen data. Using Boltz-2’s internal sequence embeddings, we trained a predictive model for binding affinity. This approach yielded stronger ROC performance than baseline models, achieving a validation AUROC of 0.645 and demonstrating the advantages of structural generative priors for antibody-antigen binding prediction.

Date issued

2025-09

URI

https://hdl.handle.net/1721.1/164655

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses