We present novel reductions from sample compression schemes in multiclass classification, regression, and adversarially robust learning settings to binary sample compression schemes. Assuming we have a compression scheme for binary concept classes of size $f(d_\mathrm{VC})$, where $d_\mathrm{VC}$ is the VC dimension, we obtain the following results: (1) If the binary compression scheme is a majority-vote or a stable compression scheme, then there exists a multiclass compression scheme of size $O(f(d_\mathrm{G}))$, where $d_\mathrm{G}$ is the graph dimension. For general binary compression schemes, we obtain a multiclass compression scheme of size $O(f(d_\mathrm{G})\log|Y|)$, where $Y$ is the label space. (2) If the binary compression scheme is a majority-vote or a stable compression scheme, then there exists an $\epsilon$-approximate compression scheme for regression over $[0,1]$-valued functions of size $O(f(d_\mathrm{P}))$, where $d_\mathrm{P}$ is the pseudo-dimension. For general binary compression schemes, we obtain a compression scheme of size $O(f(d_\mathrm{P})\log(1/\epsilon))$. These results would have significant implications if the sample compression conjecture, which posits that any binary concept class with finite VC dimension admits a binary compression scheme of size $O(d_\mathrm{VC})$, is resolved (Littlestone and Warmuth, 1986; Floyd and Warmuth, 1995; Warmuth, 2003): our reductions would then immediately extend the proof of the conjecture to the other settings. We establish similar results for adversarially robust learning and also provide an example of a concept class that is robustly learnable but admits no bounded-size compression scheme, demonstrating that learnability is not equivalent to having a compression scheme independent of the sample size, unlike in binary classification, where a compression scheme of size $2^{O(d_\mathrm{VC})}$ is attainable (Moran and Yehudayoff, 2016).