We revisit Approximate Graph Propagation (AGP), a unified framework that captures various graph propagation tasks, such as PageRank, feature propagation in Graph Neural Networks (GNNs), and graph-based Retrieval-Augmented Generation (RAG). Our work focuses on dynamic graphs and dynamic parameterized queries, where the underlying graph evolves over time through edge insertions and deletions, and the query parameters are specified on the fly to fit application needs. Our first contribution is the observation that the state-of-the-art solution, AGP-Static, can be adapted to support dynamic parameterized queries; however, several challenges remain unresolved. First, the query time complexity of AGP-Static assumes that its query algorithm uses an optimal subset sampling algorithm. Unfortunately, no such algorithm existed at that time; without it, the query complexity incurs an extra $O(\log^2 n)$ factor, where $n$ is the number of vertices in the graph. Second, AGP-Static performs poorly on dynamic graphs, taking $O(n\log n)$ time to process each update. To address these challenges, we propose a new algorithm, AGP-Static++, which is simpler yet reduces the query complexity by roughly a factor of $O(\log^2 n)$ while preserving the approximation guarantees of AGP-Static. However, AGP-Static++ still requires $O(n)$ time to process each update. To better support dynamic graphs, we further propose AGP-Dynamic, which achieves $O(1)$ amortized time per update, significantly improving on the aforementioned $O(n)$ per-update bound, while preserving the query complexity and approximation guarantees. Finally, our comprehensive experiments validate the theoretical improvements: compared to the baselines, our algorithm achieves speedups of up to $177\times$ in update time and $10\times$ in query efficiency.
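To make the subset sampling primitive mentioned above concrete, the following is a minimal sketch of the problem as it is usually stated: given $n$ items, include item $i$ independently with probability $p_i$. It is not the algorithm used by AGP-Static, AGP-Static++, or AGP-Dynamic. The function names `naive_subset_sample` and `uniform_subset_sample` are hypothetical, and the fast variant assumes the special case where all probabilities are equal, in which case geometric skipping gives expected time $O(1 + np)$ instead of $O(n)$.

```python
import math
import random


def naive_subset_sample(probs, rng):
    """Include index i independently with probability probs[i]; O(n) time."""
    return [i for i, p in enumerate(probs) if rng.random() < p]


def uniform_subset_sample(n, p, rng):
    """Subset sampling when every item has the same probability p.

    Instead of flipping n coins, jump directly to the next included
    index using geometrically distributed gaps. Expected time is
    O(1 + n * p). (Illustrative special case only.)
    """
    if p <= 0.0:
        return []
    if p >= 1.0:
        return list(range(n))
    out = []
    log_q = math.log1p(-p)  # log(1 - p), strictly negative
    i = 0  # next candidate index
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1]
        # floor(skip) = number of excluded items before the next included one
        skip = math.log(u) / log_q
        if skip >= n - i:
            return out
        i += int(skip)
        out.append(i)
        i += 1


if __name__ == "__main__":
    rng = random.Random(0)
    sample = uniform_subset_sample(1_000_000, 1e-4, rng)
    print(len(sample))  # roughly n * p = 100 indices, in increasing order
```

The general case with non-uniform probabilities needs additional machinery (for example, grouping items by probability range), which this sketch deliberately omits.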