Early stopping based on hold-out data is a popular regularization technique designed to mitigate overfitting and increase the predictive accuracy of neural networks. Models trained with early stopping often provide relatively accurate predictions, but they generally still lack precise statistical guarantees unless they are further calibrated using independent hold-out data. This paper addresses this limitation with conformalized early stopping: a novel method that combines early stopping with conformal calibration while efficiently recycling the same hold-out data. This leads to models that are both accurate and able to provide exact predictive inferences without multiple data splits or overly conservative adjustments. Practical implementations are developed for different learning tasks -- outlier detection, multi-class classification, regression -- and their competitive performance is demonstrated on real data.
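For context, the conventional calibration step that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard split-conformal regression, which needs a calibration set independent of the data used for training and early stopping. It is not the conformalized early stopping method itself, whose details are not given here. The function name `split_conformal_interval` and its parameters are hypothetical, chosen only for this example.

```python
import numpy as np


def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Standard split-conformal intervals from absolute residuals.

    Assumption: (cal_pred, cal_y) come from hold-out data that was NOT
    used to fit the model or to choose the early-stopping epoch.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_y = np.asarray(cal_y, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)

    n = cal_y.size
    # Conformity scores: absolute residuals on the calibration set, sorted.
    scores = np.sort(np.abs(cal_y - cal_pred))

    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))

    # If k exceeds n there are too few calibration points for this alpha,
    # and the only valid interval is the whole real line.
    q = scores[k - 1] if k <= n else np.inf

    return test_pred - q, test_pred + q
```

Under exchangeability of calibration and test points, intervals built this way cover the true response with probability at least `1 - alpha`. The cost is that the calibration data cannot also be used to select the stopping epoch, which is the inefficiency the paper aims to remove.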