We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality, such as the effective sample size, are not appropriate for scalable Bayesian sampling algorithms that are asymptotically biased, such as stochastic gradient Langevin dynamics. Instead, the gold standard is to use the kernel Stein discrepancy (KSD), which is itself not scalable because its cost is quadratic in the number of samples. The KSD and its faster extensions also typically suffer from the curse of dimensionality and can require extensive tuning. To address these limitations, we develop the polynomial Stein discrepancy (PSD) and an associated goodness-of-fit test. While the new test is not fully convergence-determining, we prove that it detects differences in the first r moments for Gaussian targets. We empirically show that the test has higher power than its competitors in several examples, and at a lower computational cost. Finally, we demonstrate that the PSD can help practitioners select hyper-parameters of Bayesian sampling algorithms more efficiently than competitors.
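To illustrate why polynomial test functions can expose moment mismatches at a cost linear in the number of samples, the following is a minimal sketch and not the PSD as defined in the paper. It assumes the second-order Langevin Stein operator (A g)(x) = Δg(x) + ∇log p(x)·∇g(x), applies it to the monomials x_i and x_i², and averages over the samples; these averages have zero expectation under the target. The function names, the Euclidean-norm aggregation, and the standard Gaussian target are all illustrative assumptions.

```python
import numpy as np


def stein_moment_checks(samples, score):
    """Average the Langevin Stein operator applied to x_i and x_i**2.

    For g(x) = x_i:     (A g)(x) = score_i(x)
    For g(x) = x_i**2:  (A g)(x) = 2 + 2 * x_i * score_i(x)
    Both averages have zero expectation when samples follow the target.
    """
    s = score(samples)  # shape (n, d): gradient of the log target density
    first = s.mean(axis=0)
    second = (2.0 + 2.0 * samples * s).mean(axis=0)
    return first, second


def polynomial_stein_sketch(samples, score):
    """Hypothetical aggregate: Euclidean norm of the averaged Stein terms."""
    first, second = stein_moment_checks(samples, score)
    return float(np.sqrt(np.sum(first**2) + np.sum(second**2)))


def std_normal_score(x):
    """Score of a standard Gaussian target (assumed here for illustration)."""
    return -x


rng = np.random.default_rng(0)
good = rng.standard_normal((5000, 2))            # samples from the target
biased = 1.5 * rng.standard_normal((5000, 2))    # variance inflated to 2.25

d_good = polynomial_stein_sketch(good, std_normal_score)
d_biased = polynomial_stein_sketch(biased, std_normal_score)
print(d_good, d_biased)
```

Each evaluation makes a single pass over the samples, in contrast to the pairwise kernel evaluations behind the quadratic cost of the KSD. With the inflated variance, the second-order term has expectation 2 - 2(2.25) = -2.5 per coordinate, so the biased samples give a clearly larger value than the samples drawn from the target.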