United We Stand: Using Epoch-wise Agreement of Ensembles to Combat Overfit

Deep neural networks have become the method of choice for many image classification tasks, largely because they can fit very complex functions defined over raw images. The downside of such powerful learners is the danger of overfitting the training set, which leads to poor generalization and is usually countered by regularization and "early stopping" of training. In this paper, we propose a new deep network ensemble classifier that is very effective against overfitting. We begin with a theoretical analysis of a regression model, whose prediction, that the variance among classifiers increases when overfitting occurs, is demonstrated empirically in commonly used deep networks. Guided by these results, we construct a new ensemble-based prediction method designed to combat overfitting, in which the final prediction is the most consensual prediction throughout training. On multiple image and text classification datasets, we show that when regular ensembles suffer from overfitting, our method eliminates the resulting loss in generalization, and often even surpasses the performance obtained by early stopping. Our method is easy to implement and can be integrated with any training scheme and architecture, without prior knowledge beyond the training set. Accordingly, it is a practical and useful tool for overcoming overfitting.
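To make the idea of a "most consensual prediction throughout training" concrete, here is a minimal NumPy sketch. It is an interpretation, not the authors' exact procedure: the abstract does not define the agreement score, so the sketch assumes that agreement is the fraction of ensemble members voting for a class at a given epoch, and that the final label is the class whose agreement peaks highest at any epoch. The function names `consensus_predict` and `final_epoch_majority`, and the toy `preds` array, are hypothetical.

```python
import numpy as np

def consensus_predict(preds, num_classes):
    """Pick, per sample, the class that reached the highest ensemble agreement
    at any single epoch (assumed reading of "most consensual prediction").

    preds: integer labels of shape (epochs, models, samples).
    Returns an integer array of shape (samples,).
    """
    preds = np.asarray(preds)
    # one_hot[e, m, s, c] is 1 if model m predicted class c for sample s at epoch e
    one_hot = np.eye(num_classes)[preds]
    # agreement[e, s, c]: fraction of models voting for class c at epoch e
    agreement = one_hot.mean(axis=1)
    # peak[s, c]: highest agreement class c reached at any epoch
    peak = agreement.max(axis=0)
    # ties are broken in favor of the lowest class index
    return peak.argmax(axis=1)

def final_epoch_majority(preds, num_classes):
    """Baseline: plain majority vote of the ensemble at the last epoch."""
    one_hot = np.eye(num_classes)[np.asarray(preds)[-1]]  # (models, samples, classes)
    return one_hot.sum(axis=0).argmax(axis=1)

# Toy data: 3 epochs, 3 models, 2 samples, 3 classes.
# The ensemble agrees early on and drifts apart in the last epoch.
preds = [
    [[0, 1], [0, 1], [0, 2]],  # epoch 0
    [[0, 1], [0, 1], [1, 1]],  # epoch 1
    [[1, 2], [2, 2], [1, 0]],  # epoch 2
]
print(consensus_predict(preds, 3))     # [0 1]
print(final_epoch_majority(preds, 3))  # [1 2]
```

In the toy data the last-epoch majority vote follows the late disagreement, while the consensus rule keeps the labels on which the ensemble was unanimous earlier in training. Whether this matches the paper's scoring in detail would need to be checked against the full text.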

Overfitting

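In AI, overfitting usually refers to a learned model that is too complex: it performs well on the training set but poorly on the test set, so it generalizes badly. Overfitting is also called over-learning. It arises during parameter fitting because the training data contain sampling error, and a complex model fits that sampling error along with the underlying signal.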
