Standard Bayesian learning is known to generalize suboptimally under misspecification and in the presence of outliers. PAC-Bayes theory demonstrates that the free energy criterion minimized by Bayesian learning is a bound on the generalization error for Gibbs predictors (i.e., for single models drawn at random from the posterior) under the assumption of sampling distributions uncontaminated by outliers. This viewpoint explains the limitations of Bayesian learning both when the model is misspecified, in which case ensembling is required, and when the data are affected by outliers. In recent work, PAC-Bayes bounds -- referred to as PAC$^m$ -- were derived to introduce free energy metrics that account for the performance of ensemble predictors, obtaining enhanced performance under misspecification. This work presents a novel robust free energy criterion that combines the generalized logarithm score function with PAC$^m$ ensemble bounds. The proposed free energy training criterion produces predictive distributions that are able to concurrently counteract the detrimental effects of misspecification -- with respect to both likelihood and prior distribution -- and outliers.
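
To make the combination of the generalized logarithm score with an ensemble predictor concrete, the following is a minimal sketch rather than the criterion proposed in this work. It assumes the generalized logarithm takes the common form $\log_t(x) = (x^{1-t} - 1)/(1-t)$, which recovers $\log(x)$ as $t \to 1$, and it scores an equally weighted ensemble of $m$ unit-variance Gaussian models. The function names, the Gaussian likelihood, and the choice of $t$ are illustrative assumptions, and the prior-dependent regularization term of a free energy criterion is omitted.

```python
import math


def log_t(x, t):
    """Generalized logarithm (assumed form); equals math.log(x) at t = 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def gaussian_pdf(x, mean, std=1.0):
    """Density of a Gaussian model; a stand-in likelihood for illustration."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def ensemble_log_t_loss(data, means, t):
    """Average -log_t of the ensemble likelihood (1/m) * sum_i p(x | theta_i).

    `means` plays the role of m model parameters drawn from a posterior.
    The KL regularization term of a free energy criterion is not included.
    """
    total = 0.0
    for x in data:
        mixture = sum(gaussian_pdf(x, mu) for mu in means) / len(means)
        total += -log_t(mixture, t)
    return total / len(data)


if __name__ == "__main__":
    ensemble = [-0.5, 0.0, 0.5]          # hypothetical ensemble of m = 3 models
    clean = [0.1, -0.3, 0.4]
    contaminated = clean + [10.0]        # one outlier far from every model
    for t in (1.0, 0.5):
        print(t, ensemble_log_t_loss(clean, ensemble, t),
              ensemble_log_t_loss(contaminated, ensemble, t))
```

For $t < 1$ the per-sample loss $-\log_t(\cdot)$ is bounded above by $1/(1-t)$, so a single outlier can shift the average only by a limited amount, whereas under the standard logarithm ($t = 1$) its contribution grows without bound as the likelihood approaches zero.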