dc.contributor.author | Hong, Seokyoung
dc.contributor.author | Chattaraj, Krishna Gopal
dc.contributor.author | Guo, Jing
dc.contributor.author | Trout, Bernhardt L
dc.contributor.author | Braatz, Richard D
dc.date.accessioned | 2026-02-20T23:33:13Z
dc.date.available | 2026-02-20T23:33:13Z
dc.date.issued | 2025-02-04
dc.identifier.uri | https://hdl.handle.net/1721.1/164930
dc.description.abstract | Motivation: The accurate prediction of O-GlcNAcylation sites is crucial for understanding disease mechanisms and developing effective treatments. Previous machine learning (ML) models primarily relied on primary or secondary protein structural and related properties, which have limitations in capturing the spatial interactions of neighboring amino acids. This study introduces local environmental features as a novel approach that incorporates three-dimensional spatial information, significantly improving model performance by considering the spatial context around the target site. Additionally, we utilize sparse recurrent neural networks to effectively capture the sequential nature of the proteins and to identify key factors influencing O-GlcNAcylation as an explainable ML model. Results: Our findings demonstrate the effectiveness of our proposed features, with the model achieving an F1 score of 28.3%, as well as feature selection capability, with the model using only the top 20% of features achieving the highest F1 score of 32.02%, a 1.4-fold improvement over existing PTM models. Statistical analysis of the top 20 features confirmed their consistency with the literature. This method not only boosts prediction accuracy but also paves the way for further research in understanding and targeting O-GlcNAcylation. Availability and implementation: All code, data, and features used in this study are available in the GitHub repository: https://github.com/pseokyoung/o-glcnac- | en_US
dc.language.iso | en
dc.publisher | Oxford University Press | en_US
dc.relation.isversionof | 10.1093/bioinformatics/btaf034 | en_US
dc.rights | Creative Commons Attribution | en_US
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US
dc.source | Oxford University Press | en_US
dc.title | Enhanced O-glycosylation site prediction using explainable machine learning technique with spatial local environment | en_US
dc.type | Article | en_US
dc.identifier.citation | Seokyoung Hong, Krishna Gopal Chattaraj, Jing Guo, Bernhardt L Trout, Richard D Braatz, Enhanced O-glycosylation site prediction using explainable machine learning technique with spatial local environment, Bioinformatics, Volume 41, Issue 2, February 2025, btaf034. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Department of Chemical Engineering | en_US
dc.relation.journal | Bioinformatics | en_US
dc.eprint.version | Final published version | en_US
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US
dc.date.updated | 2026-02-20T23:19:25Z
dspace.orderedauthors | Hong, S; Chattaraj, KG; Guo, J; Trout, BL; Braatz, RD | en_US
dspace.date.submission | 2026-02-20T23:19:39Z
mit.journal.volume | 41 | en_US
mit.journal.issue | 2 | en_US
mit.license | PUBLISHER_CC
mit.metadata.status | Authority Work and Publication Information Needed | en_US