We study Personalized PageRank (PPR), where for nodes $s,t$ in a graph $G$, $\pi(s,t)$ is the probability that an $\alpha$-decay random walk from $s$ ends at $t$. Two key queries are: Single-Source PPR (SSPPR), computing $\pi(s,\cdot)$ for fixed $s$, and Single-Target PPR (STPPR), computing $\pi(\cdot,t)$ for fixed $t$. SSPPR is studied under absolute error (SSPPR-A), requiring $|\hat{\pi}(s,t)-\pi(s,t)|\le \epsilon$, and relative error (SSPPR-R), requiring $|\hat{\pi}(s,t)-\pi(s,t)|\le c\pi(s,t)$ for $t$ with $\pi(s,t)\ge \delta$; STPPR adopts the same relative criterion. These queries support web search, recommendation, sparsification, and graph neural networks. The best known upper bounds are $O(\min(\tfrac{\log(1/\epsilon)}{\epsilon^{2}},\tfrac{\sqrt{m\log n}}{\epsilon},m\log\tfrac{1}{\epsilon}))$ for SSPPR-A and $O(\min(\tfrac{\log(1/\delta)}{\delta},\sqrt{\tfrac{m\log n}{\delta}},m\log\tfrac{\log n}{\delta m}))$ for SSPPR-R, while lower bounds remain $\Omega(\min(n,1/\epsilon))$, $\Omega(\min(m,1/\delta))$, and $\Omega(\min(n,1/\delta))$, leaving large gaps. We close these gaps by (i) presenting a Monte Carlo algorithm that tightens the SSPPR-A upper bound to $O(1/\epsilon^{2})$, and (ii) proving, via an arc-centric construction, lower bounds $\Omega(\min(m,\tfrac{\log(1/\delta)}{\delta}))$ for SSPPR-R, $\Omega(\min(m,\tfrac{1}{\epsilon^{2}}))$ (and intermediate $\Omega(\min(m,\tfrac{\log(1/\epsilon)}{\epsilon}))$) for SSPPR-A, and $\Omega(\min(m,\tfrac{n}{\delta}\log n))$ for STPPR. For practical settings ($\delta=\Theta(1/n)$, $\epsilon=\Theta(n^{-1/2})$, $m\in\Omega(n\log n)$) these bounds meet the best known upper bounds, establishing the optimality of Monte Carlo and FORA for SSPPR-R, our algorithm for SSPPR-A, and RBS for STPPR, and yielding a near-complete complexity landscape for PPR queries.
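To make the definition of $\pi(s,t)$ concrete, the following is a minimal sketch of the textbook Monte Carlo estimator for single-source PPR: simulate $\alpha$-decay random walks from $s$ and report the fraction that end at each node. It is not the algorithm proposed in this work. The function name `monte_carlo_ppr`, the adjacency-list representation, the default parameters, and the rule that a walk stops at a node with no out-neighbors are all assumptions made for illustration.

```python
import random
from collections import Counter


def monte_carlo_ppr(graph, source, alpha=0.2, num_walks=10_000, seed=0):
    """Estimate pi(source, .) by simulating alpha-decay random walks.

    graph: dict mapping each node to a list of its out-neighbors
           (hypothetical representation chosen for this sketch).
    Returns a dict {node: estimated pi(source, node)} for nodes hit at least once.
    """
    rng = random.Random(seed)
    end_counts = Counter()
    for _ in range(num_walks):
        node = source
        # At every step the walk stops with probability alpha;
        # otherwise it moves to a uniformly random out-neighbor.
        while rng.random() >= alpha:
            neighbors = graph[node]
            if not neighbors:
                # Assumption: a walk that reaches a dead end stops there.
                break
            node = rng.choice(neighbors)
        end_counts[node] += 1
    # The empirical end-point frequency is the estimate of pi(source, t).
    return {t: count / num_walks for t, count in end_counts.items()}
```

On the two-node cycle `{0: [1], 1: [0]}` with $\alpha = 0.2$, the exact values are $\pi(0,0) = 1/(2-\alpha) \approx 0.556$ and $\pi(0,1) = (1-\alpha)/(2-\alpha) \approx 0.444$, and the estimates should approach these as `num_walks` grows. Increasing `num_walks` reduces the sampling error at a proportional increase in running time.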