DSpace@MIT

Enhanced O-glycosylation site prediction using explainable machine learning technique with spatial local environment

Author(s)
Hong, Seokyoung; Chattaraj, Krishna Gopal; Guo, Jing; Trout, Bernhardt L; Braatz, Richard D
Terms of use
Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/
Abstract
Motivation: The accurate prediction of O-GlcNAcylation sites is crucial for understanding disease mechanisms and developing effective treatments. Previous machine learning (ML) models relied primarily on primary or secondary protein structure and related properties, which are limited in their ability to capture the spatial interactions of neighboring amino acids. This study introduces local environmental features, a novel approach that incorporates three-dimensional spatial information and significantly improves model performance by considering the spatial context around the target site. Additionally, we use sparse recurrent neural networks both to capture the sequential nature of proteins and, as an explainable ML model, to identify key factors influencing O-GlcNAcylation.

Results: Our findings demonstrate the effectiveness of the proposed features, with the model achieving an F1 score of 28.3%. They also demonstrate the model's feature selection capability: using only the top 20% of features, the model achieves the highest F1 score of 32.02%, a 1.4-fold improvement over existing PTM models. Statistical analysis of the top 20 features confirmed their consistency with the literature. This method not only boosts prediction accuracy but also paves the way for further research into understanding and targeting O-GlcNAcylation.

Availability and implementation: All code, data, and features used in this study are available in the GitHub repository: https://github.com/pseokyoung/o-glcnac-
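The contrast the abstract draws between sequence-based features and a spatial local environment can be illustrated with a minimal sketch. This is not the authors' feature pipeline: the function names, the toy coordinates, and the 6.0 cutoff radius are all assumptions chosen for illustration. The sketch only shows that residues far apart in sequence can be close in three-dimensional space, which a fixed sequence window would miss.

```python
from math import dist


def sequence_window(n_residues, site_index, half_width):
    """Indices within `half_width` positions of the site along the chain."""
    start = max(0, site_index - half_width)
    stop = min(n_residues, site_index + half_width + 1)
    return [i for i in range(start, stop) if i != site_index]


def spatial_neighbors(coords, site_index, radius):
    """Indices of residues whose coordinates lie within `radius` of the site.

    `coords` holds one (x, y, z) tuple per residue. The radius is a
    hypothetical cutoff, not a value taken from the paper.
    """
    center = coords[site_index]
    return [
        i
        for i, xyz in enumerate(coords)
        if i != site_index and dist(center, xyz) <= radius
    ]


# Toy hairpin (made-up coordinates): residues 0-3 run along x,
# residues 4-7 run back alongside them at y = 5.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (11.4, 5.0, 0.0),
    (7.6, 5.0, 0.0),
    (3.8, 5.0, 0.0),
    (0.0, 5.0, 0.0),
]

by_sequence = sequence_window(len(coords), site_index=0, half_width=1)
by_space = spatial_neighbors(coords, site_index=0, radius=6.0)
print(by_sequence, by_space)
```

In this toy chain, residue 7 is seven positions away from residue 0 in sequence but lies within the cutoff in space, so only the spatial query returns it.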
Date issued
2025-02-04
URI
https://hdl.handle.net/1721.1/164930
Department
Massachusetts Institute of Technology. Department of Chemical Engineering
Journal
Bioinformatics
Publisher
Oxford University Press
Citation
Seokyoung Hong, Krishna Gopal Chattaraj, Jing Guo, Bernhardt L Trout, Richard D Braatz, Enhanced O-glycosylation site prediction using explainable machine learning technique with spatial local environment, Bioinformatics, Volume 41, Issue 2, February 2025, btaf034.
Version: Final published version

Collections
  • MIT Open Access Articles
