Deep neural networks have become the method of choice for many image classification tasks, largely because they can fit very complex functions defined over raw images. The downside of such powerful learners is the danger of overfitting the training set, which leads to poor generalization and is usually avoided by regularization and "early stopping" of the training. In this paper, we propose a new deep network ensemble classifier that is very effective against overfit. We begin with a theoretical analysis of a regression model, whose prediction, that the variance among classifiers increases when overfit occurs, is demonstrated empirically in deep networks in common use. Guided by these results, we construct a new ensemble-based prediction method designed to combat overfit, in which the final prediction is the one most agreed upon throughout training. On multiple image and text classification datasets, we show that when regular ensembles suffer from overfit, our method eliminates the resulting harmful reduction in generalization and often even surpasses the performance obtained by early stopping. Our method is easy to implement and can be integrated with any training scheme and architecture, without prior knowledge beyond the training set. Accordingly, it is a practical and useful tool for overcoming overfit.
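The idea of predicting with the label most agreed upon throughout training can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: it assumes that the hard label predictions of every ensemble member were stored at every epoch, measures agreement as the fraction of members voting for a class, and selects for each sample the class whose agreement peaks highest over all epochs. The function names `epoch_agreement`, `consensus_predict`, and `final_epoch_majority`, and the toy array `preds`, are hypothetical.

```python
import numpy as np


def epoch_agreement(preds, num_classes):
    """Fraction of ensemble members voting for each class.

    preds: int array of shape (epochs, models, samples) holding predicted labels.
    Returns a float array of shape (epochs, samples, classes).
    """
    one_hot = np.eye(num_classes)[preds]   # (epochs, models, samples, classes)
    return one_hot.mean(axis=1)            # average over the models axis


def consensus_predict(preds, num_classes):
    """Assumed rule: choose the class whose agreement peaks highest over all epochs."""
    agreement = epoch_agreement(preds, num_classes)   # (epochs, samples, classes)
    peak_agreement = agreement.max(axis=0)            # (samples, classes)
    return peak_agreement.argmax(axis=1)


def final_epoch_majority(preds, num_classes):
    """Baseline: ordinary majority vote using only the last epoch."""
    return epoch_agreement(preds[-1:], num_classes)[0].argmax(axis=1)


# Toy data: 3 epochs, 3 models, 2 samples, 2 classes.
# Agreement is high early on and degrades in the last epoch,
# mimicking the increased variance among classifiers under overfit.
preds = np.array([
    [[0, 1], [0, 1], [0, 0]],   # epoch 0
    [[0, 1], [1, 1], [0, 1]],   # epoch 1
    [[1, 1], [1, 0], [0, 0]],   # epoch 2
])

consensus = consensus_predict(preds, num_classes=2)        # array([0, 1])
last_epoch = final_epoch_majority(preds, num_classes=2)    # array([1, 0])
```

In this toy case the last-epoch majority vote flips both labels, while the consensus rule keeps the labels on which the ensemble agreed unanimously earlier in training. In a real pipeline `preds` would be filled from per-epoch checkpoints on the test set, which adds storage or inference cost but needs no data beyond the training set.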