For any set system $H=(V,R)$ with $R \subseteq 2^V$, a subset $S \subseteq V$ is called \emph{shattered} if every $S' \subseteq S$ can be obtained as the intersection of $S$ with some set in $R$. The \emph{VC-dimension} of $H$ is the size of a largest shattered subset of $V$. In this paper, we focus on the problem of computing the VC-dimension of graphs. Specifically, given a graph $G=(V,E)$, the VC-dimension of $G$ is defined as the VC-dimension of $(V, \mathcal N)$, where $\mathcal N$ contains each subset of $V$ that is the closed neighborhood of some vertex $v \in V$ in $G$. Our main contribution is an algorithm for computing the VC-dimension of any graph, whose effectiveness is shown through experiments on various types of practical graphs, including graphs with millions of vertices. A key reason for its efficiency is that practical graphs have small VC-dimension, at most 8 in our experiments. As a by-product, we present several new bounds relating the graph VC-dimension to other classical graph-theoretic notions. We also establish the $W[1]$-hardness of the graph VC-dimension problem by extending a previous result for arbitrary set systems.
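To make the definitions above concrete, here is a minimal brute-force sketch in Python. It is not the algorithm contributed in this paper; it only enumerates candidate sets directly from the definition and is practical for tiny graphs alone. The adjacency-dictionary input format and the function names `closed_neighborhoods`, `is_shattered`, and `vc_dimension` are assumptions made for illustration.

```python
from itertools import combinations


def closed_neighborhoods(adj):
    """Return the distinct closed neighborhoods N[v] = {v} plus neighbors of v."""
    return {frozenset(nbrs) | {v} for v, nbrs in adj.items()}


def is_shattered(S, family):
    """S is shattered if the traces S & N, for N in family, give all 2^|S| subsets of S."""
    S = frozenset(S)
    traces = {S & N for N in family}
    return len(traces) == 2 ** len(S)


def vc_dimension(adj):
    """Brute-force VC-dimension of a non-empty graph given as {vertex: neighbors}."""
    family = closed_neighborhoods(adj)
    vertices = list(adj)
    d = 0
    # Shattering a set of size k requires at least 2^k distinct sets in the family,
    # and every subset of a shattered set is shattered, so we can stop at the
    # first size for which no shattered set exists.
    while 2 ** (d + 1) <= len(family):
        if any(is_shattered(S, family) for S in combinations(vertices, d + 1)):
            d += 1
        else:
            break
    return d


# Hypothetical example graphs (not taken from the paper's experiments).
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

For the path on five vertices, the pair $\{0, 2\}$ is shattered: the closed neighborhoods of vertices 4, 0, 2, and 1 intersect it in $\emptyset$, $\{0\}$, $\{2\}$, and $\{0,2\}$ respectively. In the triangle, all closed neighborhoods coincide, so no non-empty set is shattered.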