We introduce a generic estimator for the false discovery rate of any model selection procedure, in common statistical modeling settings including the Gaussian linear model, Gaussian graphical model, and model-X setting. We prove that our method has a conservative (non-negative) bias in finite samples under standard statistical assumptions, and provide a bootstrap method for assessing its standard error. For methods like the Lasso, forward-stepwise regression, and the graphical Lasso, our estimator serves as a valuable companion to cross-validation, illuminating the tradeoff between prediction error and variable selection accuracy as a function of the model complexity parameter.