The robustness and generalization ability of machine learning models are of utmost importance in many application domains, and there is wide interest in efficient ways to analyze these properties. One important direction is to study the connection between them. Prior theories suggest that a robust learning algorithm can produce trained models with high generalization ability. However, we show in this work that the existing robustness-based error bounds are vacuous for the Bayes optimal classifier, the best among all measurable classifiers for a classification problem with overlapping classes: those bounds cannot converge to the true error of this ideal classifier. This is undesirable, surprising, and previously unknown. We then present a class of novel bounds that are model-dependent and provably tighter than the existing robustness-based ones. Unlike prior bounds, ours are guaranteed to converge to the true error of the best classifier as the number of samples increases. We further provide extensive experiments and find that two of our bounds are often non-vacuous for a large class of deep neural networks pretrained on ImageNet.
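For reference, a minimal sketch of the standard notion invoked above (the notation here is ours, not taken from the paper): given a joint distribution $P$ over inputs $X$ and labels $Y$, the Bayes optimal classifier and its true error are
\[
h^\star(x) = \arg\max_{y} P(Y = y \mid X = x),
\qquad
R^\star = \mathbb{E}_{X}\big[\, 1 - \max_{y} P(Y = y \mid X) \,\big].
\]
When the classes overlap, $R^\star > 0$; an error bound for $h^\star$ is vacuous in the sense above when it cannot converge to this value.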