Testing for Overfitting

High-complexity models are notorious in machine learning for overfitting, a phenomenon in which a model fits its training data well but fails to generalize to the underlying data-generating process. A typical procedure for circumventing overfitting computes empirical risk on a holdout set and halts training (or raises a flag) once that risk begins to increase. This practice often helps produce a well-generalizing model, but the justification for why it works is primarily heuristic. We discuss the overfitting problem and explain why standard asymptotic and concentration results do not hold for evaluation with training data. We then introduce and argue for a hypothesis test by which model performance can be evaluated using training data and overfitting can be quantitatively defined and detected. The test relies on the same concentration bounds: because they guarantee that empirical means approximate their true mean with high probability, we conclude that the empirical means should also approximate each other. We stipulate conditions under which this test is valid, describe how it may be used to identify overfitting, articulate a further nuance by which distributional shift may be flagged, and highlight an alternative notion of learning that usefully captures generalization in the absence of uniform PAC guarantees.
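The idea that two empirical means which each concentrate around the same true mean must also lie close to each other can be illustrated with a minimal sketch. This is not the test as specified in the paper: it assumes per-example losses bounded in [0, 1], uses Hoeffding's inequality as the concentration bound, and splits the significance level across the two means with a union bound. The function names `hoeffding_radius` and `overfitting_test` are hypothetical.

```python
import math


def hoeffding_radius(n, delta):
    """Two-sided Hoeffding radius for the mean of n i.i.d. losses in [0, 1].

    With probability at least 1 - delta, the empirical mean lies within
    this radius of the true mean.
    """
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def overfitting_test(train_losses, holdout_losses, delta=0.05):
    """Compare training and holdout empirical risk (illustrative sketch).

    Null hypothesis (an assumption of this sketch): both empirical risks
    concentrate around the same true risk. If so, by the triangle
    inequality and a union bound, their gap exceeds the sum of the two
    radii with probability at most delta.
    """
    n, m = len(train_losses), len(holdout_losses)
    train_risk = sum(train_losses) / n
    holdout_risk = sum(holdout_losses) / m

    # Each mean gets half of the error budget (union bound).
    threshold = hoeffding_radius(n, delta / 2) + hoeffding_radius(m, delta / 2)
    gap = holdout_risk - train_risk

    return {
        "train_risk": train_risk,
        "holdout_risk": holdout_risk,
        "gap": gap,
        "threshold": threshold,
        "reject": abs(gap) > threshold,  # True flags a suspicious gap
    }


if __name__ == "__main__":
    # Synthetic 0-1 losses: perfect fit on training data, 50% error on holdout.
    result = overfitting_test([0.0] * 1000, [1.0] * 500 + [0.0] * 500)
    print(result["gap"], result["threshold"], result["reject"])
```

A rejection here says only that the gap is larger than concentration alone would explain under the stated assumptions; it does not by itself distinguish overfitting from distributional shift between the two samples.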

