Sample compression schemes were defined by Littlestone and Warmuth (1986) as an abstraction of the structure underlying many learning algorithms. In a sample compression scheme, we are given a large sample of vertices of a fixed hypergraph with labels indicating containment in some hyperedge. The task is to compress the sample in such a way that the labels of the original sample can be retrieved. The size of a sample compression scheme is the amount of information that is kept in the compression. Every hypergraph with a sample compression scheme of bounded size must have bounded VC-dimension. Conversely, Moran and Yehudayoff (J. ACM, 2016) showed that every hypergraph of bounded VC-dimension admits a sample compression scheme of bounded size. We study a specific class of hypergraphs arising from balls in graphs. The schemes that we construct (unlike those constructed by Moran and Yehudayoff) are \textit{proper}, meaning that we retrieve not only the labeling of the original sample but also a hyperedge (ball) consistent with the original labeling. First, we prove that for every graph $G$ of treewidth at most $t$, the hypergraph of balls in $G$ has a proper sample compression scheme of size $\mathcal{O}(t\log t)$; this is tight up to the logarithmic factor and improves on the quadratic (improper) bound that follows from the result of Moran and Yehudayoff. Second, we prove an analogous result for graphs of cliquewidth at most $t$.
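To make the definitions concrete, the following is a minimal brute-force sketch of a proper sample compression scheme for balls in a small graph. It is not the construction behind the stated bounds: the reconstructor simply returns the first ball, in a fixed order, that is consistent with the kept examples, and the compressor searches for a smallest subsample on which this works. All function names (`all_balls`, `compress`, `reconstruct`, and so on) and the example path graph are assumptions chosen for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def all_balls(adj):
    """Every distinct ball B(c, r) = {v : dist(c, v) <= r}, in a fixed order."""
    balls = []
    for c in sorted(adj):
        dist = bfs_distances(adj, c)
        for r in range(max(dist.values()) + 1):
            ball = frozenset(v for v, d in dist.items() if d <= r)
            if ball not in balls:
                balls.append(ball)
    return balls


def consistent(ball, sample):
    """True if membership in the ball matches the label of every example."""
    return all((v in ball) == label for v, label in sample)


def reconstruct(balls, kept):
    """First ball in the fixed order that is consistent with the kept examples."""
    for ball in balls:
        if consistent(ball, kept):
            return ball
    return None


def compress(balls, sample):
    """Smallest subsample whose reconstruction is consistent with the full sample."""
    for k in range(len(sample) + 1):
        for kept in combinations(sample, k):
            ball = reconstruct(balls, kept)
            if ball is not None and consistent(ball, sample):
                return kept
    raise ValueError("sample is not realizable by any ball")


# Illustrative input: the path 0 - 1 - 2 - 3 - 4 - 5, a graph of treewidth 1.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
balls = all_balls(path)

# Labeled sample realized by the ball of radius 1 around vertex 3.
sample = [(0, False), (2, True), (3, True), (4, True), (5, False)]
kept = compress(balls, sample)
ball = reconstruct(balls, kept)
print("kept examples:", kept)
print("reconstructed ball:", sorted(ball))
```

The scheme is proper by design, since `reconstruct` only ever returns a ball. The exhaustive search is exponential in the sample size and says nothing about how small the kept subsample can be in general; it only illustrates what the compressor and reconstructor must guarantee.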