Bayesian cross-validation by parallel Markov Chain Monte Carlo

Brute-force cross-validation (CV) is a general method for predictive assessment and model selection that is applicable to a wide range of Bayesian models. However, in many cases brute-force CV is too computationally burdensome to form part of interactive modeling workflows, especially when inference relies on Markov chain Monte Carlo (MCMC). In this paper we present a method for conducting fast Bayesian CV by massively parallel MCMC. On suitable accelerator hardware, for many applications our approach is about as fast (in wall-clock time) as a single full-data model fit. Parallel CV is more flexible than existing fast CV approximation methods because it can easily exploit a wide range of scoring rules and data partitioning schemes. This is particularly useful for CV methods designed for non-exchangeable data. Our approach also delivers accurate estimates of Monte Carlo and CV uncertainty. In addition to parallelizing computations, parallel CV speeds up inference by reusing information from the MCMC adaptation and inference already carried out during initial fitting and checking of the full-data model. We propose MCMC diagnostics for parallel CV applications, including a summary of MCMC mixing based on the popular potential scale reduction factor ($\hat{R}$) and MCMC effective sample size ($\widehat{ESS}$) measures. Furthermore, we describe a method for determining whether an $\hat{R}$ diagnostic indicates approximate stationarity of the chains, which may be of more general interest for applications beyond parallel CV. For parallel CV to work on memory-constrained computing accelerators, we show that parallel CV and its associated diagnostics can be implemented using online (streaming) algorithms, which are well suited to parallel computing environments with limited memory. Constant-memory algorithms allow parallel CV to scale up to very large blocking designs.
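To illustrate what a constant-memory diagnostic might look like, the following minimal sketch keeps only a running mean and variance per chain (Welford's online algorithm) and computes the classic, non-split Gelman-Rubin $\hat{R}$ from those accumulators. It is an illustration under stated assumptions, not the paper's implementation: the names `StreamingMoments`, `streaming_rhat`, and `demo` are hypothetical, the chains are simulated as independent Gaussian draws rather than real MCMC output, and the diagnostics proposed in the paper may differ from the textbook formula used here.

```python
import math
import random


class StreamingMoments:
    """Welford accumulator: running mean and variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # One-pass update; the draw x is not stored afterwards.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Unbiased sample variance; requires at least two draws.
        return self.m2 / (self.n - 1)


def streaming_rhat(chains):
    """Classic (non-split) R-hat from per-chain accumulators of equal length."""
    m = len(chains)
    n = chains[0].n
    means = [c.mean for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and within-chain variance W.
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    within = sum(c.variance for c in chains) / m
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


def demo(offsets, n_draws=2000, seed=0):
    """Simulate one stand-in 'chain' per offset (assumption: i.i.d. Gaussian
    draws, not real MCMC) and return R-hat computed from streaming state."""
    rng = random.Random(seed)
    chains = [StreamingMoments() for _ in offsets]
    for _ in range(n_draws):
        for acc, offset in zip(chains, offsets):
            acc.update(rng.gauss(offset, 1.0))
    return streaming_rhat(chains)


if __name__ == "__main__":
    print("well-mixed chains:", demo([0.0, 0.0, 0.0, 0.0]))
    print("one shifted chain:", demo([0.0, 0.0, 0.0, 3.0]))
```

Each accumulator stores three numbers regardless of how many draws it has seen, so memory use grows with the number of chains and monitored quantities but not with chain length. In this toy setup, chains sharing one target give a value close to 1, while a chain centered elsewhere pushes the value well above 1.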
