TY - JOUR
T1 - Toward equitable major histocompatibility complex binding predictions
AU - Glynn, Eric
AU - Ghersi, Dario
AU - Singh, Mona
N1 - Publisher Copyright: Copyright © 2025 the Author(s).
PY - 2025/2/25
Y1 - 2025/2/25
AB - Deep learning tools that predict peptide binding by major histocompatibility complex (MHC) proteins play an essential role in developing personalized cancer immunotherapies and vaccines. In order to ensure equitable health outcomes from their application, MHC binding prediction methods must work well across the vast landscape of MHC alleles observed across human populations. Here, we show that there are alarming disparities across individuals in different racial and ethnic groups in how much binding data are associated with their MHC alleles. We introduce a machine learning framework to assess the impact of this data imbalance for predicting binding for any given MHC allele, and apply it to develop a state-of-the-art MHC binding prediction model that additionally provides per-allele performance estimates. We demonstrate that our MHC binding model successfully mitigates much of the data disparities observed across racial groups. To address remaining inequities, we devise an algorithmic strategy for targeted data collection. Our work lays the foundation for further development of equitable MHC binding models for use in personalized immunotherapies.
KW - cancer immunotherapies
KW - deep learning
KW - health equity in precision oncology
KW - neoantigen prediction
KW - predicting major histocompatibility complex (MHC) binding
UR - http://www.scopus.com/inward/record.url?scp=85218959391&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85218959391&partnerID=8YFLogxK
U2 - 10.1073/pnas.2405106122
DO - 10.1073/pnas.2405106122
M3 - Article
C2 - 39964728
SN - 0027-8424
VL - 122
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 8
M1 - e2405106122
ER -