Network motifs are small patterns of interactions that recur frequently in a system. They shed light on the interplay between the topology and the dynamics of complex networks across various domains. In this work, we focus on the problem of counting occurrences of small sub-hypergraph patterns in very large hypergraphs, where higher-order interactions connect arbitrary numbers of system units. We show that directly exploiting higher-order structures speeds up the counting process compared to traditional data mining techniques for exact motif discovery. Moreover, hyperedge sampling further improves performance at the cost of small errors in the estimated motif frequencies. We evaluate our method on several real-world datasets describing face-to-face interactions, co-authorship, and human communication. We show that our approximate algorithm extracts higher-order motifs faster and at a larger scale, beyond the computational limits of an exact approach.
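To make the trade-off between exact counting and hyperedge sampling concrete, the following is a minimal sketch and not the algorithm evaluated in this work. It assumes a deliberately simplified setting: only patterns built around a size-3 hyperedge are considered, and each is classified by how many of its three node pairs also appear as size-2 hyperedges. The function name `count_triple_patterns` and its parameters `sample_size` and `seed` are hypothetical. With `sample_size=None` the count is exact; otherwise a uniform sample of size-3 hyperedges is inspected and each observation is rescaled by the inverse sampling fraction, which gives an unbiased estimate under this assumption.

```python
import random
from collections import Counter
from itertools import combinations


def count_triple_patterns(hyperedges, sample_size=None, seed=0):
    """Count patterns around size-3 hyperedges.

    Each pattern is keyed by the number (0 to 3) of node pairs inside the
    size-3 hyperedge that are themselves size-2 hyperedges.
    Illustrative sketch only: names and pattern classes are assumptions.
    """
    if sample_size is not None and sample_size <= 0:
        raise ValueError("sample_size must be positive or None")

    # Deduplicate hyperedges and make membership tests O(1).
    edge_set = {frozenset(e) for e in hyperedges}
    # Sort for a reproducible sampling order (assumes comparable node labels).
    triples = sorted((e for e in edge_set if len(e) == 3), key=sorted)

    if sample_size is None or sample_size >= len(triples):
        chosen, scale = triples, 1.0  # exact counting
    else:
        rng = random.Random(seed)
        chosen = rng.sample(triples, sample_size)  # uniform, without replacement
        scale = len(triples) / sample_size  # inverse sampling fraction

    counts = Counter()
    for triple in chosen:
        nested_pairs = sum(
            frozenset(pair) in edge_set for pair in combinations(triple, 2)
        )
        counts[nested_pairs] += scale
    return dict(counts)


if __name__ == "__main__":
    toy = [(1, 2, 3), (1, 2), (2, 3), (1, 3), (4, 5, 6), (4, 5), (7, 8, 9)]
    print(count_triple_patterns(toy))                 # exact counts
    print(count_triple_patterns(toy, sample_size=2))  # sampled estimate
```

In this sketch the sampled estimate always sums to the total number of size-3 hyperedges, while the split across pattern classes varies with the sample; a larger `sample_size` reduces that variance at the cost of more work.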