In this paper, we approach the problem of uncertainty quantification in deep learning through a predictive framework, which captures uncertainty in model parameters by specifying our assumptions about the predictive distribution of unseen future data. Under this view, we show that deep ensembling (Lakshminarayanan et al., 2017) is a fundamentally misspecified model class, since it assumes that future data are supported on existing observations only -- a situation rarely encountered in practice. To address this limitation, we propose MixupMP, a method that constructs a more realistic predictive distribution using popular data augmentation techniques. MixupMP operates as a drop-in replacement for deep ensembles, where each ensemble member is trained on a random simulation from this predictive distribution. Grounded in the recently proposed framework of martingale posteriors (Fong et al., 2023), MixupMP returns samples from an implicitly defined Bayesian posterior. Our empirical analysis shows that MixupMP achieves superior predictive performance and uncertainty quantification on various image classification datasets when compared with existing Bayesian and non-Bayesian approaches.
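To make the ensemble-training recipe concrete, the following is a minimal sketch, assuming mixup-style augmentation with Beta(alpha, alpha) mixing weights stands in for the random simulation from the predictive distribution. The `make_model` constructor, the synthetic data, and all hyperparameters are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def make_model():
    # Toy classifier for illustration; any architecture would do.
    return nn.Sequential(nn.Flatten(),
                         nn.Linear(28 * 28, 128), nn.ReLU(),
                         nn.Linear(128, 10))

def mixup_batch(x, y_onehot, alpha=1.0):
    """Draw one mixup sample: convex combinations of shuffled pairs."""
    lam = np.random.beta(alpha, alpha)
    idx = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[idx]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[idx]
    return x_mix, y_mix

def train_member(model, loader, n_classes, epochs=10, alpha=1.0):
    """Train one ensemble member; each call sees an independent
    random simulation, so members differ beyond weight initialization."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    for _ in range(epochs):
        for x, y in loader:
            y1 = F.one_hot(y, n_classes).float()
            x_mix, y_mix = mixup_batch(x, y1, alpha)
            # Cross-entropy against the soft mixup labels.
            loss = -(y_mix * F.log_softmax(model(x_mix), dim=1)).sum(1).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Synthetic stand-in data, purely for a runnable example.
x = torch.randn(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))
train_loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

# The trained members act as (approximate) posterior samples.
ensemble = [train_member(make_model(), train_loader, n_classes=10)
            for _ in range(5)]

def predict(ensemble, x):
    """Predictive distribution: average member softmax outputs."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=1) for m in ensemble])
    return probs.mean(0)
```

As in deep ensembles, uncertainty estimates come from the spread of member predictions; here the extra randomness of the simulated training sets is what connects the procedure to a posterior in the martingale-posterior sense.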