Fine-tuning Boltz for Antibody-Antigen Binding Prediction
Author(s)
Kim, Ji Won
Advisor
Barzilay, Regina
Abstract
Accurate prediction of antibody-antigen binding is a central challenge in computational immunology. Its direct implications for therapeutic antibody design and vaccine development have made it one of the most rapidly growing fields. Recent advances in protein language models and structure prediction have provided new modeling tools, yet these approaches often fall short in capturing the fine-grained features that drive binding specificity between antibodies and antigens. This thesis evaluates two strategies for improving predictive performance. First, we investigated custom multiple sequence alignments (MSAs). Standard Boltz-2 training relies on MSAs from broad protein databases, which capture global diversity but under-represent lineage-specific constraints. To address this, we constructed antibody-specific MSAs to test whether restricting the search space to antibody repertoires improves model learning. Gains in downstream binding prediction were limited, suggesting that further work is needed on training models with such specialized databases in the first place. Second, we fine-tuned Boltz-2, a generative structural foundation model, on curated antibody-antigen data. By leveraging Boltz-2's internal sequence embeddings, we trained a predictive model for binding affinity. This approach yielded stronger ROC performance than baseline models, achieving a validation AUROC of 0.645 and demonstrating the advantages of structural generative priors for antibody-antigen binding prediction.
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology