Mining subgraphs with interesting structural properties from networks (or graphs) is a computationally challenging task. In this paper, we propose two algorithms for enumerating all connected induced subgraphs of a given cardinality from networks (that is, from connected undirected graphs). The first algorithm is a variant of a well-known existing algorithm and enumerates all connected induced subgraphs of cardinality $k$ in a bottom-up manner. We present data structures that give it unit-time element checking and linear space. The second algorithm enumerates all connected induced subgraphs of cardinality $k$ in a top-down manner, unlike previous algorithms, which work either bottom-up or by reverse search. We analyze and prove the correctness and complexity of the top-down algorithm. In the experiments, we evaluate the efficiency of both algorithms on a set of real-world networks from various fields. The results show that the bottom-up variant outperforms state-of-the-art algorithms when enumerating connected induced subgraphs of small cardinality, and that the top-down algorithm achieves an order-of-magnitude speedup over state-of-the-art algorithms when enumerating connected induced subgraphs of large cardinality.
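To make the enumeration problem in the abstract concrete, the following is a minimal sketch of one generic bottom-up strategy: grow a vertex set one vertex at a time from a root, admitting only vertices larger than the root, so that each connected set is produced exactly once. This is an ESU-style textbook scheme, not the paper's bottom-up variant or its data structures. The function name `connected_induced_subgraphs`, the adjacency-dictionary input, and the assumption that vertices are comparable (for example, integers) are all illustrative choices.

```python
from typing import Dict, FrozenSet, List, Set


def connected_induced_subgraphs(adj: Dict[int, Set[int]], k: int) -> List[FrozenSet[int]]:
    """Return every vertex set of size k that induces a connected subgraph.

    Illustrative ESU-style bottom-up enumeration (an assumption for this
    sketch, not the algorithm proposed in the paper). `adj` maps each vertex
    to its neighbor set in an undirected graph without self-loops.
    """
    results: List[FrozenSet[int]] = []

    def extend(sub: Set[int], ext: Set[int], root: int) -> None:
        if len(sub) == k:
            results.append(frozenset(sub))
            return
        ext = set(ext)  # local copy; candidates are consumed one by one
        while ext:
            w = ext.pop()
            new_ext = set(ext)
            # Add the "exclusive" neighbors of w: larger than the root,
            # not in the current set, and not adjacent to the current set.
            for u in adj[w]:
                if u > root and u not in sub and all(u not in adj[s] for s in sub):
                    new_ext.add(u)
            extend(sub | {w}, new_ext, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return results


if __name__ == "__main__":
    # Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for vertex_set in connected_induced_subgraphs(graph, 3):
        print(sorted(vertex_set))
```

The sketch checks adjacency to the current set by scanning its members, which costs more than unit time per check; the data structures mentioned in the abstract are aimed at exactly that cost.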