Classic Graph Neural Network (GNN) inference approaches, designed for static graphs, are ill-suited for streaming graphs that evolve with time. The dynamism intrinsic to streaming graphs necessitates constant updates, posing unique challenges to acceleration on GPUs. We address these challenges based on two key insights: (1) Inside the $k$-hop neighborhood, a significant fraction of the nodes is not impacted by the modified edges when the model uses min or max as the aggregation function; (2) When the model weights remain static while the graph structure changes, node embeddings can evolve incrementally over time by computing only the impacted part of the neighborhood. With these insights, we propose a novel method, InkStream, designed for real-time inference with minimal memory access and computation, while ensuring an output identical to that of conventional methods. InkStream operates on the principle of propagating and fetching data only when necessary. It uses an event-based system to control inter-layer effect propagation and intra-layer incremental updates of node embeddings. InkStream is highly extensible and easily configurable, allowing users to create and process customized events. We show that fewer than 10 lines of additional user code are needed to support popular GNN models such as GCN, GraphSAGE, and GIN. Our experiments with three GNN models on four large graphs demonstrate that InkStream accelerates inference by 2.5-427$\times$ on a CPU cluster and 2.4-343$\times$ on two different GPU clusters while producing outputs identical to GNN model inference on the latest graph snapshot.
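To make insight (1) concrete, the following is a minimal single-node, single-layer sketch of incremental max aggregation. It is an illustration under stated assumptions, not InkStream's actual implementation or API: the function names (`max_aggregate`, `insert_neighbor`, `remove_neighbor`) and the plain-list feature representation are hypothetical. The idea it shows is that an inserted neighbor changes the aggregate only if it exceeds the current maximum in some dimension, and a removed neighbor forces recomputation only if it might have supplied the maximum in some dimension.

```python
from typing import Dict, List, Set, Tuple

Vector = List[float]


def max_aggregate(neighbors: Set[int], h: Dict[int, Vector], dim: int) -> Vector:
    """Full recomputation: element-wise max over all neighbor features."""
    if not neighbors:
        return [0.0] * dim  # assumption: isolated nodes aggregate to zeros
    return [max(h[u][d] for u in neighbors) for d in range(dim)]


def insert_neighbor(
    agg: Vector, neighbors: Set[int], h: Dict[int, Vector], u: int
) -> Tuple[Vector, bool]:
    """Add neighbor u incrementally; return (new aggregate, whether it changed)."""
    was_empty = not neighbors
    neighbors.add(u)
    if was_empty:
        new = list(h[u])
    else:
        # Only dimensions where u exceeds the current max can change.
        new = [max(a, x) for a, x in zip(agg, h[u])]
    return new, new != agg


def remove_neighbor(
    agg: Vector, neighbors: Set[int], h: Dict[int, Vector], u: int
) -> Tuple[Vector, bool]:
    """Remove neighbor u (assumed present); recompute only if u may hold a max."""
    neighbors.discard(u)
    if all(x < a for a, x in zip(agg, h[u])):
        # u was strictly below the max everywhere, so the aggregate is unaffected.
        return agg, False
    new = max_aggregate(neighbors, h, len(agg))
    return new, new != agg


if __name__ == "__main__":
    h = {1: [1.0, 5.0], 2: [3.0, 2.0], 3: [0.5, 0.5], 4: [4.0, 1.0]}
    nbrs = {1, 2}
    agg = max_aggregate(nbrs, h, 2)
    for op, u in [(insert_neighbor, 3), (insert_neighbor, 4),
                  (remove_neighbor, 3), (remove_neighbor, 4)]:
        agg, changed = op(agg, nbrs, h, u)
        print(op.__name__, u, agg, "changed" if changed else "unchanged")
```

In a multi-layer setting, the returned `changed` flag is the kind of signal that would decide whether an update needs to propagate to the next layer; when it is false, the effect of the edge modification stops at this node. In this sketch the incremental result always equals a full recomputation over the updated neighbor set, which mirrors the abstract's claim of identical output.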