The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized model, we show that there exists a dual underparameterized model that possesses the same marginal likelihood, thus establishing a form of Bayesian duality. This enables more classical methods to be used in the overparameterized setting, revealing the Interpolating Information Criterion, a measure of model quality that naturally incorporates the choice of prior into model selection. Our new information criterion accounts for prior misspecification and for geometric and spectral properties of the model, and it is numerically consistent with known empirical and theoretical behavior in this regime.