Sampling Methods for Fast and Versatile GNN Training
Author(s)
Alkhatib, Obada
DownloadThesis PDF (3.723Mb)
Advisor
Leiserson, Charles E.
Kaler, Timothy
Iliopoulos, Alexandros-Stavros
Terms of use
Metadata
Show full item recordAbstract
Graph neural networks (GNNs) have become a commonly used class of machine learning models that achieve state-of-the-art performance in various applications. A prevalent and effective approach for applying GNNs on large datasets involves mini-batch training with sampled neighborhoods. Numerous sampling algorithms have emerged, some tailored for specific GNN applications. In this thesis, I explore ways to improve the efficiency and expressivity of existing and emerging sampling schemes.
First, I explore system solutions to facilitate the development of fast implementations of different sampling methods. I introduce FlexSample, a system for efficiently incorporating custom sampling algorithms into GNN training. FlexSample leverages the types of performance optimizations found in SALIENT, a state-of-the-art system for fast training of GNNs with node-wise sampling. In experiments with 4 GNN models which use layer-wise and subgraph sampling, FlexSample achieves up to 1.3× speed-up for end-to-end training over PyTorch Geometric with the same sampling code. Furthermore, FlexSample extends SALIENT with highly-optimized C++ implementations of FastGCN and LADIES layer-wise sampling, which achieve 2×–5× speed-up over their respective Python implementations.
Second, I introduce a novel framework for learning neighbor sampling distributions as part of GNN training. Key components of this framework, which I name PertinenceSample, are: (i) a differentiable approximation of node-wise sampling for GNNs; and (ii) a parametrization of node sampling distributions as node- or edge-wise weights of attention-like GNN layers. I present an initial exploration of the potential of PertinenceSample for improving node classification accuracy in the presence of noisy edges. Specifically, in two synthetic experiments where roughly half of a node’s neighbors may have similar features but different labels, I demonstrate that extending a GraphSAGE model with a 2-layer perceptron for learning the PertinenceSample weights can improve classification accuracy from 50%–75% to (nearly) 100%.
Date issued
2022-09Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology