A Fast Parallel Approach for Neighborhood-based Link Prediction by Disregarding Large Hubs

Link prediction can help rectify inaccuracies in various graph algorithms, stemming from unaccounted-for or overlooked links within networks. However, many existing works use a baseline approach, which incurs unnecessary computational costs due to its high time complexity. Further, many studies focus on smaller graphs, which can lead to misleading conclusions. Here, we study the prediction of links using neighborhood-based similarity measures on large graphs. In particular, we improve upon the baseline approach (IBase), and propose a heuristic approach that additionally disregards large hubs (DLH), based on the idea that high-degree nodes contribute little similarity among their neighbors. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, DLH is on average 1019x faster than IBase, especially on web graphs and social networks, while maintaining similar prediction accuracy. Notably, DLH achieves a link prediction rate of 38.1M edges/s and improves performance by 1.6x for every doubling of threads.
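The idea of scoring candidate links through shared neighbors while skipping large hubs can be illustrated with a minimal serial sketch. It is not the authors' implementation: it uses plain common-neighbor counts as one example of a neighborhood-based similarity measure, and the function name `predict_links`, the cutoff parameter `max_hub_degree`, and the toy edge list are assumptions made for illustration only. The abstract does not state how the hub threshold is chosen, and the parallelization is not shown.

```python
from collections import defaultdict
from itertools import combinations


def predict_links(edges, max_hub_degree=None, top_k=5):
    """Rank unconnected node pairs by their number of common neighbors.

    max_hub_degree is a hypothetical cutoff: nodes with more neighbors
    than this are skipped as "large hubs". None disables the cutoff.
    """
    # Build an undirected adjacency structure, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)  # freeze keys so lookups cannot add new nodes

    scores = defaultdict(int)
    for w, nbrs in adj.items():
        # Heuristic: a high-degree node is assumed to add little
        # similarity among its neighbors, so skip it entirely.
        if max_hub_degree is not None and len(nbrs) > max_hub_degree:
            continue
        # Every unconnected pair of w's neighbors shares w as a neighbor.
        for u, v in combinations(sorted(nbrs), 2):
            if v not in adj[u]:
                scores[(u, v)] += 1

    # Highest score first; ties broken by node pair for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy graph (assumed data): node 0 acts as the hub with degree 4.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4), (1, 5), (2, 5)]

print(predict_links(edges))                    # hub included
print(predict_links(edges, max_hub_degree=3))  # hub skipped
```

In this toy graph, the hub contributes only low-scoring candidate pairs, so skipping it leaves the top-ranked prediction unchanged while removing most of the pairwise work. The work done per node grows quadratically with its degree, which is why ignoring a few very high-degree nodes can save a large share of the computation.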

Link prediction

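Link prediction in networks is the task of estimating, from known information such as the network's nodes and structure, the likelihood of a link between two nodes that are not yet connected. It covers both links that exist but are unknown and links that will form in the future. Research on this problem has considerable theoretical and practical value.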
