Efficient Maximum $k$-Defective Clique Computation with Improved Time Complexity

A $k$-defective clique relaxes the notion of a clique by allowing up to $k$ edges to be missing from a complete graph. This relaxation makes it possible to find larger near-cliques and has applications in link prediction, cluster detection, social network analysis, and transportation science. The problem of finding the largest $k$-defective clique has recently been studied, and several algorithms have been proposed in the literature. However, the currently fastest algorithm, KDBB, does not improve on the trivial time complexity of $O(2^n)$, and its practical performance is still unsatisfactory. In this paper, we advance the state of the art for exact maximum $k$-defective clique computation in terms of both time complexity and practical performance. Moreover, we separate the techniques required for achieving the time complexity from those used purely for practical performance; this design choice may help the research community further improve practical efficiency without sacrificing the worst-case time complexity. Specifically, we first develop a general framework, kDC, that beats the trivial time complexity of $O(2^n)$ and achieves a better time complexity than all existing algorithms. The time complexity of kDC is achieved solely by the non-fully-adjacent-first branching rule, the excess-removal reduction rule, and the high-degree reduction rule. Then, to make kDC practically efficient, we further propose a new upper bound, two reduction rules, and an algorithm for efficiently computing a large initial solution. Extensive empirical studies on three benchmark graph collections with $290$ graphs in total demonstrate that kDC outperforms the currently fastest algorithm, KDBB, by several orders of magnitude.
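
To make the definition concrete, the following minimal Python sketch checks whether a vertex set is a $k$-defective clique and finds a maximum one by exhaustive search. It is not kDC and implements none of its branching or reduction rules; it only illustrates the trivial exponential baseline that the abstract refers to. The function names, the adjacency-set representation, and the example graph are assumptions made for illustration.

```python
from itertools import combinations


def missing_edges(adj, vertices):
    """Count the vertex pairs in `vertices` that are not joined by an edge."""
    return sum(1 for u, v in combinations(vertices, 2) if v not in adj[u])


def is_k_defective_clique(adj, vertices, k):
    """A vertex set is a k-defective clique if it misses at most k edges."""
    return missing_edges(adj, vertices) <= k


def max_k_defective_clique_bruteforce(adj, k):
    """Exhaustive baseline (illustrative only): try subsets, largest first."""
    nodes = sorted(adj)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            if is_k_defective_clique(adj, subset, k):
                return set(subset)
    return set()


# Hypothetical example graph, stored as symmetric adjacency sets.
# Edges: 0-1, 0-2, 1-2, 1-3, 2-3, 3-4.
example_graph = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3},
}

largest_clique = max_k_defective_clique_bruteforce(example_graph, 0)
largest_1_defective = max_k_defective_clique_bruteforce(example_graph, 1)
```

With $k = 0$ the search returns an ordinary maximum clique of three vertices, while with $k = 1$ it returns the four vertices $\{0, 1, 2, 3\}$, which miss only the edge between 0 and 3. This shows how the relaxation admits a larger near-clique.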
