Sample compression theory provides generalization guarantees for predictors that can be fully defined by a subset of the training dataset together with a (short) message string, typically a binary sequence. Previous works provided generalization bounds only for the zero-one loss, which is restrictive, notably when applied to deep learning approaches. In this paper, we present a general framework for deriving new sample compression bounds that hold for real-valued losses. We empirically demonstrate the tightness and versatility of the bounds by evaluating them on models of different types, e.g., neural networks and decision forests, trained with the Pick-To-Learn (P2L) meta-algorithm, which transforms the training method of any machine learning predictor so that it yields sample-compressed predictors. In contrast to existing P2L bounds, ours remain valid in the non-consistent case, i.e., when the compressed predictor makes errors on the training set.
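For context, the following is a minimal sketch, in our own notation rather than the paper's, of the classical zero-one-loss sample compression bound that this framework generalizes; it applies in the consistent case, where the reconstructed predictor makes no errors on the training sample. With probability at least $1-\delta$ over an i.i.d. sample of size $m$, every predictor $h$ reconstructed from a compression set of $k$ training examples and a message $\sigma$ from a finite set $\Sigma$, and consistent with the sample, satisfies
\[
R(h) \;\le\; \frac{1}{m-k}\left(\ln\binom{m}{k} + \ln|\Sigma| + \ln\frac{1}{\delta}\right),
\]
where $R(h)$ denotes the true zero-one risk. Bounds of this form trace back to Littlestone and Warmuth; the contribution described in the abstract is to extend such guarantees to real-valued losses and to the non-consistent case.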