Approximating Single-Source Personalized PageRank with Absolute Error Guarantees

Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes $s$ and $t$ on a graph $G=(V,E)$, the PPR value $\pi(s,t)$ is defined as the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$, where the walk terminates with probability $\alpha$ at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node $s$ to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates $\hat{\pi}(s,t)$ satisfy $\max_{t\in V}\big|\hat{\pi}(s,t)-\pi(s,t)\big|\le\varepsilon$ for a given error bound $\varepsilon$. We propose an algorithm that achieves this with high probability, with an expected running time of

- $\widetilde{O}\big(\sqrt{m}/\varepsilon\big)$ for directed graphs, where $m=|E|$;
- $\widetilde{O}\big(\sqrt{d_{\mathrm{max}}}/\varepsilon\big)$ for undirected graphs, where $d_{\mathrm{max}}$ is the maximum node degree in the graph;
- $\widetilde{O}\left(n^{\gamma-1/2}/\varepsilon\right)$ for power-law graphs, where $n=|V|$ and $\gamma\in\left(\frac{1}{2},1\right)$ is the extent of the power law.

These sublinear bounds improve upon existing results. We also study the case when degree-normalized absolute error guarantees are desired, requiring $\max_{t\in V}\big|\hat{\pi}(s,t)/d(t)-\pi(s,t)/d(t)\big|\le\varepsilon_d$ for a given error bound $\varepsilon_d$, where the graph is undirected and $d(t)$ is the degree of node $t$. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$. This improves over the previously known $O(1/\varepsilon_d)$ complexity.
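To make the definition of $\pi(s,t)$ and the absolute error criterion concrete, the following is a minimal Python sketch of the plain Monte Carlo baseline. It is not the algorithm proposed in this work. It simulates $\alpha$-discounted walks from $s$, uses the fraction of walks that terminate at each node $t$ as $\hat{\pi}(s,t)$, and compares the result against a near-exact power-iteration reference. The function names, the adjacency-list representation, the toy graph, and the requirement that every node has at least one out-neighbor are all assumptions made for illustration. By a standard Hoeffding and union-bound argument, on the order of $\log n/\varepsilon^2$ walks suffice for this baseline to meet the $\varepsilon$ absolute error bound with high probability, which is the kind of cost the bounds above aim to improve on.

```python
import random


def _check_no_dangling(adj):
    # Assumption for this sketch: every node has at least one out-neighbor.
    for v, nbrs in adj.items():
        if not nbrs:
            raise ValueError(f"node {v} has no out-neighbors")


def exact_ppr(adj, s, alpha=0.2, num_iters=200):
    """Near-exact single-source PPR via power iteration.

    Accumulates pi(s, .) = sum_k alpha * (1 - alpha)^k * (e_s P^k),
    truncated after num_iters terms.
    """
    _check_no_dangling(adj)
    pi = {v: 0.0 for v in adj}
    r = {v: 0.0 for v in adj}  # probability mass of walks still alive
    r[s] = 1.0
    for _ in range(num_iters):
        nxt = {v: 0.0 for v in adj}
        for v, mass in r.items():
            pi[v] += alpha * mass  # walk terminates at v
            share = (1.0 - alpha) * mass / len(adj[v])
            for u in adj[v]:
                nxt[u] += share  # walk moves to a uniform out-neighbor
        r = nxt
    return pi


def monte_carlo_ppr(adj, s, alpha=0.2, num_walks=20000, seed=0):
    """Estimate pi(s, .) as the fraction of walks terminating at each node."""
    _check_no_dangling(adj)
    rng = random.Random(seed)
    counts = {v: 0 for v in adj}
    for _ in range(num_walks):
        v = s
        # At each step: stop with probability alpha, otherwise move on.
        while rng.random() >= alpha:
            v = rng.choice(adj[v])
        counts[v] += 1
    return {v: c / num_walks for v, c in counts.items()}


def max_abs_error(estimate, reference):
    """max over t of |pi_hat(s, t) - pi(s, t)|."""
    return max(abs(estimate[t] - reference[t]) for t in reference)


if __name__ == "__main__":
    # Hypothetical toy directed graph as adjacency lists.
    graph = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
    reference = exact_ppr(graph, s=0)
    estimate = monte_carlo_ppr(graph, s=0)
    print("max absolute error:", max_abs_error(estimate, reference))
```

Running more walks tightens the estimate at a rate of roughly $1/\sqrt{\text{num\_walks}}$, which is why this baseline becomes expensive for small $\varepsilon$.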


