Sum-product networks (SPNs) are probabilistic models characterized by exact and fast evaluation of fundamental probabilistic operations. Their superior computational tractability has led to applications in many fields, such as machine learning under time constraints or accuracy requirements and real-time systems. However, the structural constraints that enable fast inference in SPNs also increase learning-time complexity and can be an obstacle to building highly expressive SPNs. This study aimed to develop a Bayesian learning approach that can be implemented efficiently on large-scale SPNs. We derived a new full conditional probability for Gibbs sampling by marginalizing multiple random variables, so that the posterior distribution can be obtained quickly. The complexity analysis revealed that our sampling algorithm works efficiently even for the largest possible SPN. Furthermore, we proposed a hyperparameter tuning method that balances the diversity of the prior distribution and optimization efficiency in large-scale SPNs. Our method improved learning-time complexity and, in numerical experiments on more than 20 datasets, ran tens to more than one hundred times faster while achieving superior predictive performance.