We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality, such as the effective sample size, are not appropriate for scalable Bayesian sampling algorithms like stochastic gradient Langevin dynamics, which are asymptotically biased. Instead, the gold standard is the kernel Stein discrepancy (KSD), which is itself not scalable given its quadratic cost in the number of samples. The KSD and its faster extensions also typically suffer from the curse of dimensionality and can require extensive tuning. To address these limitations, we develop the polynomial Stein discrepancy (PSD) and an associated goodness-of-fit test. While the new test is not fully convergence-determining, we prove that it detects differences in the first r moments in the Bernstein-von Mises limit. We show empirically that the test has higher power than its competitors in several examples, at a lower computational cost. Finally, we demonstrate that the PSD can assist practitioners in selecting hyper-parameters of Bayesian sampling algorithms more efficiently than competing methods.