RIPPLE++: An Incremental Framework for Efficient GNN Inference on Evolving Graphs

From arXiv. Extended full-length version of the paper that appeared at ICDCS 2025: "RIPPLE: Scalable Incremental GNN Inferencing on Large Streaming Graphs", Pranjal Naman and Yogesh Simmhan, in International Conference on Distributed Computing Systems (ICDCS), 2025. DOI: https://doi.org/10.1109/icdcs63083.2025.00088

Real-world graphs are dynamic, with frequent updates to their structure and features due to evolving vertex and edge properties. These continual changes pose significant challenges for efficient inference in graph neural networks (GNNs). Existing vertex-wise and layer-wise inference approaches are ill-suited for dynamic graphs, as they incur redundant computations, large neighborhood traversals, and high communication costs, especially in distributed settings. Additionally, while sampling-based approaches can be adopted to approximate final-layer embeddings, these are often not preferred in critical applications due to their non-determinism. These limitations hinder the low-latency inference required in real-time applications. To address this, we propose RIPPLE++, a framework for streaming GNN inference that efficiently and accurately updates embeddings in response to changes in the graph structure or features. RIPPLE++ introduces a generalized incremental programming model that captures the semantics of GNN aggregation functions and incrementally propagates updates to affected neighborhoods. RIPPLE++ accommodates all common graph updates, including vertex and edge additions and deletions and vertex feature updates. RIPPLE++ supports both single-machine and distributed deployments. On a single machine, it achieves up to $56$K updates/sec on sparse graphs like Arxiv ($169$K vertices, $1.2$M edges) and about $7.6$K updates/sec on denser graphs like Products ($2.5$M vertices, $123.7$M edges), with latencies of $0.06$--$960$ ms, outperforming state-of-the-art baselines by $2.2$--$24\times$ on throughput. In distributed settings, RIPPLE++ offers up to $\approx25\times$ higher throughput and $20\times$ lower communication costs compared to recomputing baselines.
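
To make the idea of incrementally propagating updates concrete, the following is a minimal sketch, not the RIPPLE++ implementation. It assumes a single layer with a plain sum aggregator and no non-linearity, so that a change can be applied as a delta to each affected neighbor instead of re-aggregating its whole neighborhood. The class `IncrementalSumAggregator` and all of its method names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class IncrementalSumAggregator:
    """Toy one-layer sum aggregation kept up to date incrementally.

    Assumption: agg[v] is the sum of the features of v's in-neighbors.
    This is an illustration only, not the paper's programming model.
    """

    def __init__(self, dim):
        self.dim = dim
        self.features = {}                           # vertex -> feature vector
        self.out_neighbors = defaultdict(set)        # u -> {v : edge u -> v}
        self.agg = defaultdict(lambda: [0.0] * dim)  # v -> running sum

    def _apply_delta(self, v, delta, sign=1.0):
        # Add (or subtract) a delta vector to v's running aggregate.
        acc = self.agg[v]
        for i in range(self.dim):
            acc[i] += sign * delta[i]

    def add_vertex(self, v, feat):
        self.features[v] = list(feat)
        self.agg[v]  # materialise a zero aggregate for the new vertex

    def add_edge(self, u, v):
        # A new edge u -> v contributes u's feature to v's aggregate.
        if v not in self.out_neighbors[u]:
            self.out_neighbors[u].add(v)
            self._apply_delta(v, self.features[u], +1.0)

    def delete_edge(self, u, v):
        # Removing u -> v retracts u's contribution from v's aggregate.
        if v in self.out_neighbors[u]:
            self.out_neighbors[u].remove(v)
            self._apply_delta(v, self.features[u], -1.0)

    def update_feature(self, u, new_feat):
        # Only the difference (new - old) is pushed to u's out-neighbors.
        old = self.features[u]
        delta = [n - o for n, o in zip(new_feat, old)]
        self.features[u] = list(new_feat)
        for v in self.out_neighbors[u]:
            self._apply_delta(v, delta)

    def recompute(self, v):
        # Full re-aggregation, used here only to check the incremental result.
        total = [0.0] * self.dim
        for u, outs in self.out_neighbors.items():
            if v in outs:
                for i in range(self.dim):
                    total[i] += self.features[u][i]
        return total


g = IncrementalSumAggregator(dim=2)
g.add_vertex("a", [1.0, 2.0])
g.add_vertex("b", [3.0, 4.0])
g.add_vertex("c", [0.0, 0.0])
g.add_edge("a", "c")
g.add_edge("b", "c")
g.update_feature("a", [5.0, 5.0])
g.delete_edge("b", "c")
print(g.agg["c"] == g.recompute("c"))
```

In this sketch each update touches only the out-neighbors of the changed vertex, whereas `recompute` scans all edges. Extending the idea to multiple layers, non-linear activations, or non-invertible aggregators such as max would need additional bookkeeping that this example does not attempt.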
