We consider the problem of model selection for interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized model, we show that there exists a dual underparameterized model that possesses the same marginal likelihood, thus establishing a form of Bayesian duality. This enables more classical methods to be used in the overparameterized setting, revealing the Interpolating Information Criterion, a measure of model quality that naturally incorporates the choice of prior into the model selection. Our new information criterion accounts for prior misspecification and for geometric and spectral properties of the model, and it is numerically consistent with known empirical and theoretical behavior in this regime.
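To give some intuition for the duality claim, the following is a minimal sketch in a much simpler setting than the one treated here: Gaussian linear regression with a Gaussian prior and additive noise. It is an analogy under stated assumptions, not the paper's construction. In that setting the marginal likelihood depends on the design matrix only through its n-by-n Gram matrix, so a model with p > n parameters shares its marginal likelihood with a model that has only n parameters. All names (`log_marginal_likelihood`, `X_over`, `X_dual`, `prior_var`, `noise_var`) are hypothetical and chosen for illustration.

```python
import numpy as np

def log_marginal_likelihood(X, y, prior_var=1.0, noise_var=0.1):
    """Log evidence of y under y = X w + eps,
    with w ~ N(0, prior_var * I) and eps ~ N(0, noise_var * I).
    Assumption: linear-Gaussian model, used purely as an illustration."""
    n = X.shape[0]
    # Marginally, y ~ N(0, prior_var * X X^T + noise_var * I).
    cov = prior_var * (X @ X.T) + noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (quad + logdet + n * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
n, p = 5, 50                        # more parameters than data points
X_over = rng.normal(size=(n, p))    # overparameterized design (n x p)
y = rng.normal(size=n)

# Build an n x n "dual" design with the same Gram matrix as X_over,
# using the symmetric square root of X_over X_over^T.
gram = X_over @ X_over.T
evals, evecs = np.linalg.eigh(gram)
X_dual = evecs @ np.diag(np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T

lml_over = log_marginal_likelihood(X_over, y)   # p = 50 parameters
lml_dual = log_marginal_likelihood(X_dual, y)   # n = 5 parameters
print(np.isclose(lml_over, lml_dual))
```

The two log evidences agree up to floating-point error because `X_dual @ X_dual.T` reproduces the Gram matrix of `X_over`. The interpolating regime studied here involves a different and more general argument; this sketch only shows why a parameter count above the sample size need not change the evidence.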