Core decomposition is a classic technique for discovering densely connected regions in a graph, with a wide range of applications. Formally, a $k$-core is a maximal subgraph in which each vertex has at least $k$ neighbors. A natural extension of a $k$-core is a $(k, h)$-core, in which each node must have at least $k$ nodes that can be reached with a path of length at most $h$. The downside of $(k, h)$-core decomposition is a significant increase in computational complexity: whereas the standard core decomposition can be computed in $O(m)$ time, the generalization can require $O(n^2m)$ time, where $n$ and $m$ are the number of nodes and edges in the given graph. In this paper we propose a randomized algorithm that produces an $\epsilon$-approximation of the $(k, h)$-core decomposition with probability $1 - \delta$ in $O(\epsilon^{-2} hm (\log^2 n - \log \delta))$ time. The approximation is based on sampling the neighborhoods of nodes, and we use the Chernoff bound to prove the approximation guarantee. We also study distance-generalized dense subgraphs, show that the problem is NP-hard, provide an algorithm for discovering such subgraphs using approximate core decompositions, and provide theoretical guarantees for the quality of the discovered subgraphs. We demonstrate empirically that the approximation complements the exact computation: on networks where computing the exact decomposition is slow, computing the approximation is significantly faster.
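To make the $(k, h)$-core definition concrete, the following is a minimal Python sketch of a naive exact decomposition by peeling. It is not the randomized algorithm proposed in the paper, only an illustration of the expensive baseline that the paper approximates. The function names `h_degree` and `kh_core_numbers` and the adjacency-dictionary input are assumptions made for this example, and the sketch assumes that the $h$-neighborhood of a node consists of the nodes within distance at most $h$ in the remaining subgraph.

```python
from collections import deque


def h_degree(adj, alive, v, h):
    """Count alive nodes (other than v) within distance h of v.

    Distances are measured inside the subgraph induced by `alive`.
    """
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == h:
            continue  # do not expand beyond distance h
        for w in adj[u]:
            if w in alive and w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    return len(seen) - 1


def kh_core_numbers(adj, h):
    """Naive peeling: repeatedly remove a node with the smallest h-degree.

    Every h-degree is recomputed from scratch in each round, so this is
    deliberately simple and slow; it only illustrates the definition.
    """
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        deg = {v: h_degree(adj, alive, v, h) for v in alive}
        v = min(alive, key=lambda u: (deg[u], u))  # ties broken by node id
        k = max(k, deg[v])  # core numbers never decrease during peeling
        core[v] = k
        alive.remove(v)
    return core


# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(kh_core_numbers(toy, 1))  # ordinary k-core numbers (h = 1)
print(kh_core_numbers(toy, 2))  # distance-generalized core numbers (h = 2)
```

With $h = 1$ the sketch reduces to the ordinary core decomposition. With $h = 2$ every node of the toy graph can reach the three others within two steps, so the whole graph forms a $(3, 2)$-core. Recomputing every neighborhood after each removal is what makes the exact approach costly on large graphs, which is the cost the sampling-based approximation is meant to avoid.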