A hypergraph is a generalization of a graph that arises naturally when attribute sharing among entities is considered. Compared with graphs, hypergraphs have the distinct advantages of containing explicit communities and being more convenient to manipulate. An open problem in hypergraph research is how to compute node distances on hypergraphs accurately and efficiently. Estimating node distances lets us find a node's nearest neighbors, which has important applications in areas such as recommender systems and targeted advertising. In this paper, we propose using the expected hitting times of random walks to compute hypergraph node distances. We note that simple random walks (SRW) cannot accurately compute node distances on highly complex real-world hypergraphs, which motivates us to introduce frustrated random walks (FRW) for this task. We further benchmark our method against DeepWalk and show that, while DeepWalk can achieve comparable results, FRW has a distinct computational advantage when the number of targets is fairly small: in such cases, FRW runs in significantly less time than DeepWalk. Finally, we analyze the time complexity of our method and show that, for large and sparse hypergraphs, the complexity is approximately linear, rendering it superior to the DeepWalk alternative.
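To illustrate the hitting-time idea, the following minimal Python sketch computes exact expected hitting times of a simple random walk (SRW) on a toy hypergraph. It is not the FRW method, whose step rule is not specified in this abstract. The step rule used here (pick one incident hyperedge uniformly at random, then one other node in it uniformly at random), the function names, and the toy hypergraph are all assumptions made for illustration. The sketch also assumes a connected hypergraph in which every node belongs to at least one hyperedge with two or more nodes.

```python
from fractions import Fraction


def srw_transition(hyperedges, nodes):
    """Transition probabilities P[u][v] of a simple random walk.

    Assumed step rule (illustrative, not taken from the paper): from u,
    choose one incident hyperedge uniformly at random, then move to one
    of the other nodes in that hyperedge uniformly at random.
    """
    P = {u: {v: Fraction(0) for v in nodes} for u in nodes}
    for u in nodes:
        incident = [e for e in hyperedges if u in e and len(e) > 1]
        for e in incident:
            others = [v for v in e if v != u]
            for v in others:
                P[u][v] += Fraction(1, len(incident) * len(others))
    return P


def expected_hitting_times(P, nodes, target):
    """Expected number of steps to first reach `target` from each node.

    Solves h(target) = 0 and h(u) = 1 + sum_v P[u][v] * h(v) for
    u != target, using exact Gauss-Jordan elimination on rationals.
    """
    others = [u for u in nodes if u != target]
    n = len(others)
    # Augmented matrix [I - Q | 1], where Q is P restricted to non-target nodes.
    A = [
        [(Fraction(1) if i == j else Fraction(0)) - P[others[i]][others[j]]
         for j in range(n)] + [Fraction(1)]
        for i in range(n)
    ]
    for col in range(n):
        # Raises StopIteration if the system is singular, which happens
        # when some node cannot reach the target.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    h = {target: Fraction(0)}
    for i, u in enumerate(others):
        h[u] = A[i][n]
    return h


# Toy hypergraph (hypothetical): two hyperedges sharing node "c".
nodes = ["a", "b", "c", "d"]
hyperedges = [("a", "b", "c"), ("c", "d")]

P = srw_transition(hyperedges, nodes)
h_to_d = expected_hitting_times(P, nodes, "d")
h_to_a = expected_hitting_times(P, nodes, "a")

for u in nodes:
    print(u, "->", h_to_d[u])  # a -> 6, b -> 6, c -> 4, d -> 0
```

On this toy example the expected hitting time from c to d is 4 steps, whereas a walk starting at d reaches c in exactly one step, so a hitting time taken as a node distance is generally asymmetric. Solving one linear system per target, as done here, is only practical for small examples.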