We introduce a set of gradient-flow-guided adaptive importance sampling (IS) transformations to stabilize Monte Carlo approximations of pointwise leave-one-out (LOO) cross-validated predictions for Bayesian classification models. This methodology can be used to assess model generalizability, for instance by computing a LOO analogue of the AIC or by computing LOO ROC/PRC curves and derived metrics such as the AUROC and AUPRC. Using the calculus of variations and gradient flow, we derive two simple nonlinear single-step transformations that use gradient information to shift a model's pre-trained full-data posterior closer to the target LOO posterior predictive distributions. In doing so, the transformations stabilize the importance weights. Because the transformations involve the gradient of the likelihood function, the resulting Monte Carlo integral depends on Jacobian determinants that involve the model Hessian. We derive closed-form exact formulae for these Jacobian determinants in the cases of logistic regression and shallow ReLU-activated artificial neural networks, and provide a simple approximation that sidesteps the need to compute full Hessian matrices and their spectra. We test the methodology on an $n\ll p$ dataset that is known to produce unstable LOO IS weights.
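The idea of a single gradient step with a closed-form Jacobian determinant can be illustrated for logistic regression. The following is a minimal sketch under stated assumptions, not the two transformations derived in the paper: the map `T(theta) = theta - h * grad log lik_i(theta)`, the step size `h`, the isotropic Gaussian prior, and all function names are hypothetical choices made for illustration. For this particular map the Jacobian is a rank-one update of the identity, so its determinant follows from the matrix determinant lemma without forming a Hessian.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)

def log_lik(theta, X, y):
    # Pointwise Bernoulli log-likelihoods.
    # theta: (S, p) parameter draws, X: (n, p), y: (n,) in {0, 1} -> (S, n)
    z = theta @ X.T
    return y * log_sigmoid(z) + (1.0 - y) * log_sigmoid(-z)

def log_prior(theta, scale=1.0):
    # Isotropic Gaussian prior up to a constant (an assumption of this sketch).
    return -0.5 * np.sum(theta**2, axis=1) / scale**2

def step_away_from_point(theta, x_i, y_i, h):
    # Hypothetical single-step map T(theta) = theta - h * grad log lik_i(theta).
    s = np.exp(log_sigmoid(theta @ x_i))          # sigmoid(x_i . theta), shape (S,)
    grad = (y_i - s)[:, None] * x_i[None, :]      # gradient of log lik_i, shape (S, p)
    # J_T = I + h * s * (1 - s) * x_i x_i^T, so by the matrix determinant lemma
    # det J_T = 1 + h * s * (1 - s) * ||x_i||^2.
    log_det = np.log1p(h * s * (1.0 - s) * (x_i @ x_i))
    return theta - h * grad, log_det

def loo_log_weights(theta, X, y, i, h):
    # Unnormalized log importance weights for the LOO posterior of point i,
    # using draws from the full-data posterior pushed through T.
    theta_t, log_det = step_away_from_point(theta, X[i], y[i], h)
    ll_t = log_lik(theta_t, X, y)
    ll = log_lik(theta, X, y)
    log_target = log_prior(theta_t) + ll_t.sum(axis=1) - ll_t[:, i]  # LOO posterior at T(theta)
    log_proposal = log_prior(theta) + ll.sum(axis=1)                 # full posterior at theta
    return theta_t, log_target - log_proposal + log_det

def loo_predictive(theta, X, y, i, h):
    # Self-normalized IS estimate of P(y_i = 1 | data without i) and the
    # effective sample size of the normalized weights.
    theta_t, log_w = loo_log_weights(theta, X, y, i, h)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    p1 = np.exp(log_sigmoid(theta_t @ X[i]))
    return float(w @ p1), float(1.0 / np.sum(w**2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, S = 10, 50, 2000                        # n << p, as in the abstract
    X = rng.normal(size=(n, p))
    y = rng.integers(0, 2, size=n).astype(float)
    # Stand-in draws only: real use needs samples from the fitted full-data posterior.
    theta = 0.1 * rng.normal(size=(S, p))
    for h in (0.0, 0.05):
        p1, ess = loo_predictive(theta, X, y, i=0, h=h)
        print(f"h={h}: p(y_0=1 | y_-0) ~ {p1:.3f}, ESS = {ess:.1f}")
```

Setting `h = 0` makes the map the identity, and the log weights reduce to the negative log-likelihood of the held-out point, which is the standard untransformed LOO IS weight. Comparing the effective sample size across values of `h` is one simple way to judge whether a transformation has stabilized the weights.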