Maximum Defective Clique Computation: Improved Time Complexities and Practical Performance

The $k$-defective clique, a relaxation of the clique that allows up to $k$ missing edges, has received increasing interest recently. Although finding the maximum $k$-defective clique is NP-hard, several practical algorithms have recently been proposed in the literature, with kDC being the state of the art. kDC not only runs the fastest in practice, but also achieves the best time complexity. Specifically, it runs in $O^*(\gamma_k^n)$ time, where the $O^*$ notation hides polynomial factors; here, $\gamma_k$ is a constant that is smaller than two and depends only on $k$, and $n$ is the number of vertices in the input graph $G$. In this paper, we propose the kDC-Two algorithm to improve both the time complexity and the practical performance. kDC-Two runs in $O^*( (\alpha\Delta)^{k+2} \gamma_{k-1}^\alpha)$ time when the maximum $k$-defective clique size $\omega_k(G)$ is at least $k+2$, and in $O^*(\gamma_{k-1}^n)$ time otherwise, where $\alpha$ and $\Delta$ are the degeneracy and maximum degree of $G$, respectively. In addition, with a slight modification, kDC-Two also runs in $O^*( (\alpha\Delta)^{k+2} (k+1)^{\alpha+k+1-\omega_k(G)})$ time when parameterized by the degeneracy gap $\alpha+k+1-\omega_k(G)$; this is better than $O^*( (\alpha\Delta)^{k+2}\gamma_{k-1}^\alpha)$ when $\omega_k(G)$ is close to the degeneracy-based upper bound $\alpha+k+1$. Finally, to further improve the practical performance, we propose a new degree-sequence-based reduction rule that can be applied efficiently, and we theoretically demonstrate its effectiveness compared with the rules proposed in the literature. Extensive empirical studies on three benchmark graph collections show that our algorithm outperforms the existing fastest algorithm by several orders of magnitude.
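
To make the quantities in the abstract concrete, the following minimal Python sketch checks the $k$-defective clique definition, computes the degeneracy $\alpha$ by repeated minimum-degree removal, and evaluates the degeneracy gap $\alpha+k+1-\omega_k(G)$ on a toy graph. It is an illustration only, not kDC or kDC-Two: the function names, the adjacency-dictionary representation, and the example graph are assumptions made here, and $\omega_k(G)$ is found by exponential brute force that is usable only on tiny inputs.

```python
from itertools import combinations


def missing_edges(adj, vertices):
    """Count the vertex pairs in `vertices` that are not joined by an edge."""
    return sum(1 for u, v in combinations(vertices, 2) if v not in adj[u])


def is_k_defective_clique(adj, vertices, k):
    """A vertex set is a k-defective clique if it misses at most k edges."""
    return missing_edges(adj, vertices) <= k


def degeneracy(adj):
    """Repeatedly remove a minimum-degree vertex; the largest degree
    observed at removal time is the degeneracy of the graph."""
    deg = {v: len(neighbors) for v, neighbors in adj.items()}
    alive = set(adj)
    alpha = 0
    while alive:
        v = min(alive, key=deg.__getitem__)
        alpha = max(alpha, deg[v])
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
    return alpha


def max_k_defective_clique_size(adj, k):
    """Brute-force omega_k(G): try vertex subsets from largest to smallest.
    Exponential time; an illustrative baseline, not the paper's algorithm."""
    vertices = list(adj)
    for size in range(len(vertices), 0, -1):
        if any(is_k_defective_clique(adj, s, k)
               for s in combinations(vertices, size)):
            return size
    return 0


# Hypothetical toy graph: triangle {0, 1, 2}, vertex 3 adjacent to 0 and 1,
# and a pendant vertex 4 attached to 3.
adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1},
    3: {0, 1, 4},
    4: {3},
}

k = 1
alpha = degeneracy(adj)
omega_k = max_k_defective_clique_size(adj, k)
gap = alpha + k + 1 - omega_k  # degeneracy gap from the abstract
print(f"alpha={alpha}, omega_{k}={omega_k}, gap={gap}")
```

On this toy graph the set $\{0,1,2,3\}$ misses only the edge between 2 and 3, so it is a 1-defective clique but not a clique, and its size meets the bound $\alpha+k+1$, which gives a degeneracy gap of zero.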
