Show simple item record

dc.contributor.author: Lin, Junhong
dc.contributor.author: Guo, Xiaojie
dc.contributor.author: Zhang, Shuaicheng
dc.contributor.author: Zhu, Yada
dc.contributor.author: Shun, Julian
dc.date.accessioned: 2025-09-09T19:51:25Z
dc.date.available: 2025-09-09T19:51:25Z
dc.date.issued: 2025-08-03
dc.identifier.isbn: 979-8-4007-1454-2
dc.identifier.uri: https://hdl.handle.net/1721.1/162620
dc.description: KDD ’25, Toronto, ON, Canada [en_US]
dc.description.abstract: Graph mining has become crucial in fields such as social science, finance, and cybersecurity. Many large-scale real-world networks exhibit both heterogeneity, where multiple node and edge types exist in the graph, and heterophily, where connected nodes may have dissimilar labels and attributes. However, existing benchmarks primarily focus on either heterophilic homogeneous graphs or homophilic heterogeneous graphs, leaving a significant gap in understanding how models perform on graphs with both heterogeneity and heterophily. To bridge this gap, we introduce H2GB, a large-scale node-classification graph benchmark that brings together the complexities of both the heterophily and heterogeneity properties of real-world graphs. H2GB encompasses 9 real-world datasets spanning 5 diverse domains, 28 baseline models, and a unified benchmarking library with a standardized data loader, evaluator, unified modeling framework, and an extensible framework for reproducibility. We establish a standardized workflow supporting both model selection and development, enabling researchers to easily benchmark graph learning methods. Extensive experiments across 28 baselines reveal that current methods struggle with heterophilic and heterogeneous graphs, underscoring the need for improved approaches. Finally, we present a new variant of the model, H2G-former, developed following our standardized workflow, that excels at this challenging benchmark. Both the benchmark and the framework are publicly available at GitHub and PyPI, with documentation hosted at https://junhongmit.github.io/H2GB. [en_US]
dc.publisher: ACM|Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3711896.3737421 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Junhong Lin, Xiaojie Guo, Shuaicheng Zhang, Yada Zhu, and Julian Shun. 2025. When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '25). Association for Computing Machinery, New York, NY, USA, 5607–5618. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-09-01T07:51:39Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-09-01T07:51:40Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

