Deep neural networks have become the method of choice for many classification tasks, largely because they can fit very complex functions defined over raw data. The downside of such powerful learners is the risk of overfitting. In this paper, we introduce a novel ensemble classifier for deep networks that effectively overcomes overfitting by combining models generated at specific intermediate epochs during training. Our method incorporates useful knowledge that the models acquire during the overfitting phase, knowledge that is usually lost when early stopping is used, without degrading overall performance. To motivate this approach, we begin with a theoretical analysis of a regression model. Its prediction, that the variance among classifiers increases when overfitting occurs, is confirmed empirically in commonly used deep networks. Guided by these results, we construct a new ensemble-based prediction method, in which the predicted class is the one that attains the highest level of agreement across the training epochs. Using multiple image and text classification datasets, we show that when regular ensembles suffer from overfitting, our method eliminates the resulting loss in generalization and often even surpasses the performance obtained by early stopping. Our method is easy to implement and can be integrated with any training scheme and architecture, requiring no prior knowledge beyond the training set. It is thus a practical and useful tool for overcoming overfitting. Code is available at https://github.com/uristern123/United-We-Stand-Using-Epoch-wise-Agreement-of-Ensembles-to-Combat-Overfit.
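The epoch-wise agreement idea described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract and not the authors' implementation, which is available at the repository linked above. The function names (`agreement`, `max_agreement_predict`, `majority_vote`), the `preds[epoch][member][sample]` layout, the choice to score each class by its peak agreement over epochs, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical layout: preds[e][m][i] is the class label predicted by
# ensemble member m at training epoch e for test sample i.
Predictions = Sequence[Sequence[Sequence[int]]]


def agreement(votes: Sequence[int]) -> Dict[int, float]:
    """Fraction of ensemble members that voted for each class."""
    n = len(votes)
    return {cls: count / n for cls, count in Counter(votes).items()}


def majority_vote(epoch_preds: Sequence[Sequence[int]]) -> List[int]:
    """Regular ensemble baseline: majority vote at a single epoch."""
    n_samples = len(epoch_preds[0])
    out = []
    for i in range(n_samples):
        frac = agreement([member[i] for member in epoch_preds])
        # Ties go to the smallest label (an arbitrary, assumed rule).
        out.append(min(frac, key=lambda c: (-frac[c], c)))
    return out


def max_agreement_predict(preds: Predictions) -> List[int]:
    """For each sample, pick the class whose agreement among ensemble
    members peaks highest at any recorded epoch (assumed scoring rule)."""
    n_samples = len(preds[0][0])
    out = []
    for i in range(n_samples):
        best: Dict[int, float] = {}  # class -> highest agreement seen
        for epoch_preds in preds:
            votes = [member[i] for member in epoch_preds]
            for cls, frac in agreement(votes).items():
                best[cls] = max(best.get(cls, 0.0), frac)
        # Ties go to the smallest label (an arbitrary, assumed rule).
        out.append(min(best, key=lambda c: (-best[c], c)))
    return out


# Toy data: 3 epochs, 3 ensemble members, 2 samples.
# For sample 0 the members agree on class 1 early on, then drift to
# class 0 in the last epoch, mimicking a prediction lost to overfitting.
example = [
    [[1, 2], [1, 0], [1, 1]],  # epoch 0
    [[1, 2], [1, 2], [0, 0]],  # epoch 1
    [[0, 2], [0, 2], [1, 2]],  # epoch 2
]

print(majority_vote(example[-1]))      # final-epoch ensemble
print(max_agreement_predict(example))  # epoch-wise agreement
```

In this toy case the final-epoch majority vote returns class 0 for the first sample, while the agreement-based rule returns class 1, because class 1 was predicted unanimously at an earlier epoch. Whether the earlier consensus is the correct label depends on the data; the sketch only shows the mechanics of using agreement across epochs.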