We show that the VC-dimension of a graph can be computed in time $n^{\log d+1} d^{O(d)}$, where $d$ is the degeneracy of the input graph. The core idea of our algorithm is a data structure that efficiently answers the following query: given a (small) query set, how many vertices see exactly a specified subset of it? Constructing this data structure takes time $O(d2^dn)$; afterwards, queries can be answered efficiently using fast M\"obius inversion. The data structure turns out to be useful for a range of tasks, especially for finding bipartite patterns in degenerate graphs, and we outline an efficient algorithm for counting the number of times specific patterns occur in a graph. The largest factor in the running time of this algorithm is $O(n^c)$, where $c$ is a parameter of the pattern that we call its left covering number. Concrete applications of this algorithm include counting the number of (non-induced) bicliques in linear time, counting the number of co-matchings in quadratic time, and computing a constant-factor approximation of the ladder index in linear time. Finally, we supplement our theoretical results with several implementations and run experiments on more than 200 real-world datasets (the largest of which has 8 million edges), where we obtain interesting insights into the VC-dimension of real-world networks.
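To illustrate the kind of query described above, the following is a minimal Python sketch, not the paper's data structure. It assumes the graph is given as a dictionary of neighbour sets, computes by brute force how many vertices are adjacent to every vertex of each subset of the query set, and then applies fast M\"obius inversion over supersets to recover how many vertices see each subset exactly. The function names `superset_counts` and `mobius_superset` are hypothetical and chosen only for this example.

```python
def superset_counts(adj, query):
    """f[mask] = number of vertices adjacent to every query vertex in mask.

    Brute-force stand-in (an assumption of this sketch) for the counts
    that a precomputed data structure would supply.
    """
    k = len(query)
    f = [0] * (1 << k)
    for mask in range(1 << k):
        chosen = [query[i] for i in range(k) if mask >> i & 1]
        f[mask] = sum(1 for v in adj if all(u in adj[v] for u in chosen))
    return f


def mobius_superset(f, k):
    """Fast Moebius inversion over supersets, in O(k * 2^k) steps.

    Returns g with g[X] = sum over Y containing X of (-1)^{|Y \\ X|} * f[Y],
    i.e. the number of vertices whose neighbourhood inside the query set
    is exactly X.
    """
    g = list(f)
    for i in range(k):
        bit = 1 << i
        for mask in range(1 << k):
            if not mask & bit:
                g[mask] -= g[mask | bit]
    return g


# Small example graph: triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
query = [0, 2]  # bit 0 stands for vertex 0, bit 1 for vertex 2

exact = mobius_superset(superset_counts(adj, query), len(query))
print(exact)  # [0, 1, 2, 1]
```

In this example, index 2 of the result (binary `10`) counts the vertices that see vertex 2 but not vertex 0, namely vertices 0 and 3. The inversion step costs $O(k2^k)$ for a query set of size $k$, independent of $n$; the brute-force counting step is what a precomputed structure would replace.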