Bayesian methods are often optimal, yet increasing pressure for fast computations, especially with streaming data, brings renewed interest in faster, possibly sub-optimal, solutions. The extent to which these algorithms approximate Bayesian solutions is a question of interest, but often unanswered. We propose a methodology to address this question in predictive settings, when the algorithm can be reinterpreted as a probabilistic predictive rule. We specifically develop the proposed methodology for a recursive procedure for online learning in nonparametric mixture models, often referred to as Newton's algorithm. This algorithm is simple and fast; however, its approximation properties are unclear. By reinterpreting it as a predictive rule, we can show that it underlies a statistical model which is, asymptotically, a Bayesian, exchangeable mixture model. In this sense, the recursive rule provides a quasi-Bayes solution. While the algorithm only offers a point estimate, our clean statistical formulation allows us to provide the asymptotic posterior distribution and asymptotic credible intervals for the mixing distribution. Moreover, it gives insights for tuning the parameters, as we illustrate in simulation studies, and paves the way to extensions in various directions. Beyond mixture models, our approach can be applied to other predictive algorithms.
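To fix ideas, the recursive rule at the heart of Newton's algorithm can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a discrete mixing distribution supported on a fixed grid of points, a user-supplied kernel, and the common weight sequence alpha_n = 1/(n+1); the function names (`newton_update`, `newton_recursion`) are illustrative.

```python
import numpy as np

def newton_update(weights, alpha, lik):
    """One step of the recursive rule: blend the current weights for the
    mixing distribution with the one-step posterior weights given the
    new observation (whose kernel likelihoods at the support points are
    in `lik`)."""
    post = weights * lik
    post /= post.sum()               # posterior weights given x_n
    return (1 - alpha) * weights + alpha * post

def newton_recursion(x, thetas, kernel, w0=None):
    """Run the recursion over a data stream `x`.

    thetas : grid of support points for the mixing distribution
    kernel : kernel(x_n, thetas) -> array of likelihood values
    w0     : initial guess for the mixing weights (uniform if None)
    """
    w = np.full(len(thetas), 1.0 / len(thetas)) if w0 is None else w0.copy()
    for n, xn in enumerate(x, start=1):
        alpha = 1.0 / (n + 1)        # decaying weights, a standard choice
        w = newton_update(w, alpha, kernel(xn, thetas))
    return w
```

The update requires a single pass over the data, which is what makes the procedure attractive for streaming settings; the paper's contribution is to characterize in what sense the resulting estimate approximates a genuinely Bayesian solution.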