A hypergraph is a generalization of a graph that arises naturally when attribute-sharing among entities is considered. Compared to graphs, hypergraphs have the distinct advantage that they contain explicit communities and are more convenient to manipulate. An open problem in hypergraph research is how to calculate node distances on hypergraphs accurately and efficiently. Estimating node distances enables us to find a node's nearest neighbors, which has important applications in areas such as recommender systems and targeted advertising. In this paper, we propose using expected hitting times of random walks to compute hypergraph node distances. We note that simple random walks (SRW) cannot accurately compute node distances on highly complex real-world hypergraphs, which motivates us to introduce frustrated random walks (FRW) for this task. We further benchmark our method against DeepWalk and show that, while the latter can achieve comparable results, FRW has a distinct computational advantage when the number of targets is fairly small: in such cases, FRW runs in significantly less time than DeepWalk. Finally, we analyze the time complexity of our method and show that, for large and sparse hypergraphs, the complexity is approximately linear, rendering it superior to the DeepWalk alternative.
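To make the notion of an expected hitting time concrete, the following is a minimal sketch for a simple random walk (SRW) on a small hypergraph. It is not the FRW method proposed in the paper, whose definition is not given in this abstract. The walk rule (pick one of the current node's hyperedges uniformly at random, then move to a uniformly chosen other node of that hyperedge), the function names `transition_matrix` and `hitting_times`, and the toy hypergraph are all assumptions made for illustration. The sketch also assumes a connected hypergraph in which every node belongs to at least one hyperedge of size two or more.

```python
from fractions import Fraction


def transition_matrix(num_nodes, hyperedges):
    """Row-stochastic matrix of an (assumed) simple random walk on a hypergraph.

    Assumed rule: from node u, choose one of u's hyperedges uniformly at
    random, then move to a uniformly chosen *other* node of that hyperedge.
    """
    P = [[Fraction(0)] * num_nodes for _ in range(num_nodes)]
    for u in range(num_nodes):
        # Hyperedges containing u that offer at least one other node.
        edges = [e for e in hyperedges if u in e and len(e) > 1]
        for e in edges:
            for v in e:
                if v != u:
                    P[u][v] += Fraction(1, len(edges) * (len(e) - 1))
    return P


def hitting_times(P, target):
    """Expected number of steps to first reach `target` from every node.

    Solves h[target] = 0 and h[u] = 1 + sum_v P[u][v] * h[v] for u != target
    with exact Gauss-Jordan elimination over fractions.
    """
    n = len(P)
    others = [u for u in range(n) if u != target]
    m = len(others)
    # Augmented system (I - Q) h = 1, where Q is P restricted to non-target nodes.
    A = [
        [(Fraction(1) if u == v else Fraction(0)) - P[u][v] for v in others]
        + [Fraction(1)]
        for u in others
    ]
    for col in range(m):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        # Eliminate this column from every other row.
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    h = [Fraction(0)] * n
    for i, u in enumerate(others):
        h[u] = A[i][m]
    return h


# Toy hypergraph (hypothetical): nodes 0..3, hyperedges {0, 1, 2} and {2, 3}.
P = transition_matrix(4, [{0, 1, 2}, {2, 3}])
h = hitting_times(P, target=3)
print([str(x) for x in h])  # ['6', '6', '4', '0']
```

In this toy example, node 2 shares a hyperedge with the target and has the smallest hitting time, while nodes 0 and 1 are farther away. Hitting times are generally asymmetric: with node 0 as the target, the walk from node 3 takes 17/3 expected steps, not 6. The dense exact solve is chosen for clarity only; it does not reflect the approximately linear complexity the abstract reports for large sparse hypergraphs.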