DSpace@MIT (MIT Libraries)

High Precision Binary Trait Association on Phylogenetic Trees

Author(s)
Balogun, Ishaq O.
Advisor
Lieberman, Tami D.
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Understanding how genetic variation drives microbial phenotypes is fundamental to advancing microbiology, particularly in pathogenicity, drug resistance, and host adaptation. Traditional genome-wide association study (GWAS) methods do not account for shared evolutionary history, which confounds association analyses. Microbial GWAS approaches emerged to address this, but current methods often lack the statistical power to detect associations while controlling false discoveries, and they face computational limits at scale. Here, we present SimPhyNI (Simulation-based Phylogenetic iNteraction Inference), a computational framework for detecting binary trait-trait associations in microbial populations. SimPhyNI uses stochastic simulations of trait evolution on phylogenetic trees to detect positive and negative associations with high precision and recall. In benchmarks on large synthetic datasets, SimPhyNI achieved precision-recall AUCs (PR AUC) of 0.987 and 0.975 for positive and negative interactions, respectively, indicating near-perfect discrimination of true from neutral associations. Competing methods showed substantially lower performance, especially for negative associations. We further applied SimPhyNI to empirical datasets, recovering known biology and generating plausible hypotheses for novel mechanisms. Though tested here on binary traits, SimPhyNI’s design supports future extension to multi-state and continuous traits using generalized models. Its high recall also makes it well suited for constructing gene interaction networks and identifying co-evolving trait modules. By combining evolutionary modeling with scalable statistics, SimPhyNI advances our ability to uncover the genetic interactions that drive microbial function, ecology, and disease.
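The general idea of a simulation-based test on a phylogeny can be illustrated with a minimal sketch. This is not SimPhyNI's implementation; the abstract does not specify its model or test statistic. The sketch assumes a two-state Markov model with fixed gain and loss rates, a tip co-occurrence count as the statistic, a root state of 0, and a hypothetical four-tip toy tree. Every name here (`TREE`, `simulate_trait`, `empirical_p_values`) is illustrative only.

```python
import math
import random

# Hypothetical toy tree: node -> (parent, branch_length). Parents are listed
# before their children, which the simulation below relies on.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 0.5),
    "n2": ("root", 0.5),
    "t1": ("n1", 0.5),
    "t2": ("n1", 0.5),
    "t3": ("n2", 0.5),
    "t4": ("n2", 0.5),
}
TIPS = ["t1", "t2", "t3", "t4"]


def flip_probability(state, gain, loss, t):
    """Probability that a two-state Markov chain starting in `state`
    is in the other state after a branch of length t."""
    total = gain + loss
    if total == 0:
        return 0.0
    rate = gain if state == 0 else loss
    return (rate / total) * (1.0 - math.exp(-total * t))


def simulate_trait(tree, tips, gain, loss, rng, root_state=0):
    """Simulate one binary trait from the root down; return the tip states."""
    states = {}
    for node, (parent, length) in tree.items():
        if parent is None:
            states[node] = root_state
            continue
        s = states[parent]
        if rng.random() < flip_probability(s, gain, loss, length):
            s = 1 - s
        states[node] = s
    return [states[t] for t in tips]


def cooccurrence(a, b):
    """Number of tips at which both traits are present."""
    return sum(x & y for x, y in zip(a, b))


def empirical_p_values(obs_a, obs_b, tree, tips, rates_a, rates_b,
                       n_sims=1000, seed=0):
    """Compare observed co-occurrence with a null in which the two traits
    evolve independently on the same tree. Returns (p_positive, p_negative)."""
    rng = random.Random(seed)
    gain_a, loss_a = rates_a
    gain_b, loss_b = rates_b
    observed = cooccurrence(obs_a, obs_b)
    at_least = at_most = 0
    for _ in range(n_sims):
        sim_a = simulate_trait(tree, tips, gain_a, loss_a, rng)
        sim_b = simulate_trait(tree, tips, gain_b, loss_b, rng)
        null = cooccurrence(sim_a, sim_b)
        at_least += null >= observed
        at_most += null <= observed
    # Add-one correction keeps the empirical p-values strictly positive.
    return (at_least + 1) / (n_sims + 1), (at_most + 1) / (n_sims + 1)


if __name__ == "__main__":
    p_pos, p_neg = empirical_p_values(
        [1, 1, 0, 0], [1, 1, 0, 0], TREE, TIPS, (0.5, 0.5), (0.5, 0.5)
    )
    print(f"p (positive) = {p_pos:.3f}, p (negative) = {p_neg:.3f}")
```

Because the null traits are simulated on the same tree as the observed data, co-occurrence that arises purely from shared ancestry also appears in the null distribution, which is the confounding that the abstract attributes to traditional GWAS. In practice the rates would be estimated from the data rather than supplied by hand, as they are in this sketch.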
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162565
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
