The biharmonic distance is a fundamental metric on graphs that measures the dissimilarity between two nodes, capturing both local and global structure. It has found applications in network centrality, graph clustering, and machine learning, all of which require efficient evaluation of pairwise biharmonic distances. However, existing algorithms remain computationally expensive. The state-of-the-art method attains an absolute-error guarantee epsilon_abs with time complexity O(L^5 / epsilon_abs^2), where L denotes the truncation length. In this work, we use probe-driven random walks to improve the complexity to O(L^3 / epsilon^2) under a relative-error guarantee epsilon. We provide a relative-error guarantee rather than an absolute-error guarantee because biharmonic distances vary by orders of magnitude across node pairs, so no single absolute tolerance is meaningful for all pairs. Since L is often very large in real-world networks (for example, L >= 10^3), reducing the L-dependence from the fifth to the third power saves a factor of L^2, which is at least 10^6 for such L. Extensive experiments on real-world networks show that our method delivers 10x to 1000x per-query speedups over strong baselines at matched relative error and scales to graphs with tens of millions of nodes.
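For reference, the quantity being approximated can be computed exactly on small graphs. The sketch below is not the probe-driven random-walk method described above; it assumes the standard definition in which the squared biharmonic distance between nodes u and v is M[u,u] + M[v,v] - 2 M[u,v], where M is the pseudoinverse of the squared graph Laplacian. The function name `biharmonic_distance` and the dense NumPy approach are illustrative assumptions, and the Laplacian is called `lap` to avoid confusion with the truncation length L.

```python
import numpy as np

def biharmonic_distance(adj, u, v):
    """Exact biharmonic distance on a small connected, undirected graph.

    Illustrative baseline only: it forms a dense pseudoinverse, so it costs
    O(n^3) time and is unsuitable for large networks.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj   # graph Laplacian D - A
    lap_pinv = np.linalg.pinv(lap)         # Moore-Penrose pseudoinverse
    m = lap_pinv @ lap_pinv                # equals pinv(lap @ lap) since lap is symmetric
    sq = m[u, u] + m[v, v] - 2.0 * m[u, v]
    return float(np.sqrt(max(sq, 0.0)))    # clamp tiny negative round-off

# Path graph 0 - 1 - 2
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
d_ends = biharmonic_distance(path, 0, 2)
d_adjacent = biharmonic_distance(path, 0, 1)
```

Even on this three-node path the two distances differ (the end-to-end distance is sqrt(2), the adjacent-pair distance is sqrt(2/3)); on large networks the spread across node pairs is far wider, which is the motivation given above for a relative-error guarantee.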