Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes $s$ and $t$ on a graph $G=(V,E)$, the PPR value $\pi(s,t)$ is defined as the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$, where the walk terminates with probability $\alpha$ at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node $s$ to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates $\hat{\pi}(s,t)$ satisfy $\max_{t\in V}\big|\hat{\pi}(s,t)-\pi(s,t)\big|\le\varepsilon$ for a given error bound $\varepsilon$. We propose an algorithm that achieves this with high probability, with an expected running time of
- $\widetilde{O}\big(\sqrt{m}/\varepsilon\big)$ for directed graphs, where $m=|E|$;
- $\widetilde{O}\big(\sqrt{d_{\mathrm{max}}}/\varepsilon\big)$ for undirected graphs, where $d_{\mathrm{max}}$ is the maximum node degree in the graph;
- $\widetilde{O}\left(n^{\gamma-1/2}/\varepsilon\right)$ for power-law graphs, where $n=|V|$ and $\gamma\in\left(\frac{1}{2},1\right)$ is the extent of the power law.

These sublinear bounds improve upon existing results. We also study the case when degree-normalized absolute error guarantees are desired, requiring $\max_{t\in V}\big|\hat{\pi}(s,t)/d(t)-\pi(s,t)/d(t)\big|\le\varepsilon_d$ for a given error bound $\varepsilon_d$, where the graph is undirected and $d(t)$ is the degree of node $t$. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$. This improves over the previously known $O(1/\varepsilon_d)$ complexity.
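As a short sanity check on the degree-normalized bound (a sketch, not part of the paper's analysis), assume every node $t$ with $\pi(s,t)>0$ has degree $d(t)\ge 1$ and recall that the PPR values from a fixed source sum to at most 1. Then

$$\sum_{t\in V}\frac{\pi(s,t)}{d(t)}\;\le\;\sum_{t\in V}\pi(s,t)\;\le\;1 \quad\Longrightarrow\quad \frac{\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}}{\varepsilon_d}\;\le\;\frac{1}{\varepsilon_d}.$$

Up to the logarithmic factors hidden by $\widetilde{O}$, the degree-normalized bound is therefore never larger than $1/\varepsilon_d$, and it is strictly smaller whenever some PPR mass sits on nodes of degree greater than 1.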
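The definition of $\pi(s,t)$ and the absolute error criterion can be illustrated with a small sketch. The code below is a plain Monte Carlo baseline that simulates $\alpha$-discounted random walks directly; it is not the algorithm proposed here and does not achieve the stated running times. The function names (`monte_carlo_ppr`, `exact_ppr`, `max_abs_error`), the adjacency-list representation, and the assumption that every node has at least one out-neighbor are all illustrative choices, not taken from the paper.

```python
import random
from collections import Counter


def monte_carlo_ppr(graph, source, alpha=0.2, num_walks=20000, seed=0):
    """Estimate pi(source, t) for every node t by simulating random walks.

    `graph` maps each node to a list of its out-neighbors.
    Assumption: every node has at least one out-neighbor (no dead ends).
    """
    rng = random.Random(seed)
    stops = Counter()
    for _ in range(num_walks):
        node = source
        # At each step the walk terminates with probability alpha;
        # otherwise it moves to a uniformly random out-neighbor.
        while rng.random() >= alpha:
            node = rng.choice(graph[node])
        stops[node] += 1
    # The fraction of walks that stopped at t estimates pi(source, t).
    return {t: stops[t] / num_walks for t in graph}


def exact_ppr(graph, source, alpha=0.2, iters=200):
    """Reference values obtained by pushing probability mass step by step.

    After `iters` steps the unprocessed mass is (1 - alpha) ** iters,
    which is negligible for the defaults used here.
    """
    pi = {v: 0.0 for v in graph}
    alive = {v: 0.0 for v in graph}  # mass of walks still running at v
    alive[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        for v, mass in alive.items():
            pi[v] += alpha * mass  # walks that terminate at v now
            share = (1 - alpha) * mass / len(graph[v])
            for u in graph[v]:
                nxt[u] += share  # walks that continue to a neighbor
        alive = nxt
    return pi


def max_abs_error(estimate, reference):
    """The quantity max_t |pi_hat(s,t) - pi(s,t)| from the error guarantee."""
    return max(abs(estimate[t] - reference[t]) for t in reference)


if __name__ == "__main__":
    # A tiny undirected graph stored as symmetric adjacency lists.
    toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    est = monte_carlo_ppr(toy_graph, source=0)
    ref = exact_ppr(toy_graph, source=0)
    print("max absolute error:", max_abs_error(est, ref))
```

With this baseline, the number of walks needed for an absolute error of $\varepsilon$ grows roughly as $1/\varepsilon^2$ by standard concentration arguments, which helps explain why bounds that scale as $1/\varepsilon$ are of interest.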