Valid confidence intervals for regression with best subset selection

Classical confidence intervals computed after best subset selection are widely implemented in statistical software and are routinely used by practitioners in scientific fields to draw conclusions about significance. However, the recent literature raises increasing concerns about the validity of these confidence intervals, because they do not attain the intended frequentist coverage. In the context of the Akaike information criterion (AIC), recent studies explain an under-coverage phenomenon in terms of overfitting, where the estimate of error variance under the selected submodel is smaller than that under the true model. Under-coverage is particularly troubling in selective inference because it points to inflated Type I errors that would invalidate significant findings. In this article, we delineate a complementary, yet provably more decisive, factor behind the incorrect coverage of classical confidence intervals under AIC: the altered conditional sampling distributions of pivotal quantities. Building on selective inference techniques developed in other settings, our finite-sample characterization of the selection event under AIC reveals its geometry as a union of finitely many intervals on the real line, from which we derive new confidence intervals with guaranteed coverage for any sample size. This geometry derived for AIC selection enables exact (and typically less than exact) conditioning, circumventing the need for the excessive conditioning common in other post-selection methods. The proposed methods are easy to implement and can be broadly applied to other commonly used best subset selection criteria. In an application to a classical US consumption dataset, the proposed confidence intervals lead to different conclusions from the conventional ones, even when the selected model is the full model, yielding interpretable findings that better align with empirical observations.
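The coverage question raised above can be explored numerically. The following is a minimal Monte Carlo sketch, not the method proposed in the article: it selects between two nested Gaussian linear models by AIC and records how often the classical interval for the first coefficient covers its true value. The function names (`fit`, `aic`, `simulate_coverage`), the two-predictor design, the parameter defaults, and the use of a normal quantile in place of a t quantile are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def fit(cols, y):
    """OLS without intercept for one or two predictor columns.

    Returns (estimate of the first coefficient, its standard error, RSS).
    """
    n = len(y)
    x1 = cols[0]
    s11 = sum(a * a for a in x1)
    s1y = sum(a * b for a, b in zip(x1, y))
    if len(cols) == 1:
        b1 = s1y / s11
        rss = sum((yi - b1 * a) ** 2 for a, yi in zip(x1, y))
        return b1, math.sqrt(rss / (n - 1) / s11), rss
    x2 = cols[1]
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # determinant of the 2x2 Gram matrix X'X
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((yi - b1 * a - b2 * c) ** 2 for a, c, yi in zip(x1, x2, y))
    # The (1,1) entry of (X'X)^{-1} is s22 / det.
    return b1, math.sqrt(rss / (n - 2) * s22 / det), rss


def aic(rss, n, k):
    """Gaussian AIC up to an additive constant shared by all candidate models."""
    return n * math.log(rss / n) + 2 * k


def simulate_coverage(n=100, beta1=1.0, beta2=0.0, rho=0.7,
                      reps=2000, level=0.95, seed=0):
    """Return (empirical coverage of beta1, fraction of runs selecting the full model).

    With beta2 = 0 both candidate models are correctly specified, so beta1
    is the same inferential target whichever model AIC selects.
    """
    rng = random.Random(seed)
    # Assumption: a normal quantile approximates the t quantile for moderate n.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    covered = 0
    full_selected = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
        y = [beta1 * a + beta2 * c + rng.gauss(0, 1) for a, c in zip(x1, x2)]
        small = fit([x1], y)
        full = fit([x1, x2], y)
        # AIC picks the model with the smaller criterion value.
        if aic(full[2], n, 2) < aic(small[2], n, 1):
            b1, se, _ = full
            full_selected += 1
        else:
            b1, se, _ = small
        # Classical interval computed as if the model had been fixed in advance.
        if abs(b1 - beta1) <= z * se:
            covered += 1
    return covered / reps, full_selected / reps
```

Calling `simulate_coverage()` returns the empirical coverage, to be compared against the nominal level (0.95 by default), along with how often the full model was chosen. Varying `rho`, `beta2`, and `n` gives a rough sense of how selection affects the classical interval in this toy setting.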
