This study addresses the efficient estimation of the Kullback-Leibler (KL) divergence in Dirichlet mixture models (DMMs), a model class that is crucial for clustering compositional data. Despite the importance of DMMs, an analytically tractable expression for their KL divergence has proven elusive. Past approaches relied on computationally demanding Monte Carlo methods, which motivates our introduction of a novel variational approach. Our method yields a closed-form solution that substantially improves computational efficiency, enabling fast model comparisons and robust evaluation of estimates. Validation on real and simulated data shows that it is more efficient and more accurate than traditional Monte Carlo-based methods, opening new avenues for rapid exploration of diverse DMMs and advancing the statistical analysis of compositional data.
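To make the contrast between the two estimation strategies concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the estimator, so the mixture-level formula used here is an assumption: it is the variational approximation that Hershey and Olsen proposed for Gaussian mixtures, applied to Dirichlet components through the standard closed-form KL divergence between two Dirichlet distributions. All function names (`kl_dirichlet`, `kl_variational`, `kl_monte_carlo`) and the example parameters are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def kl_dirichlet(a, b):
    """Closed-form KL(Dir(a) || Dir(b)) for two parameter vectors of equal length."""
    a0, b0 = sum(a), sum(b)
    out = math.lgamma(a0) - math.lgamma(b0)
    out += sum(math.lgamma(bi) - math.lgamma(ai) for ai, bi in zip(a, b))
    out += sum((ai - bi) * (digamma(ai) - digamma(a0)) for ai, bi in zip(a, b))
    return out


def kl_variational(w_f, alphas_f, w_g, alphas_g):
    """Assumed estimator: Hershey-Olsen style variational approximation of KL(f || g)."""
    total = 0.0
    for wa, a in zip(w_f, alphas_f):
        num = sum(w * math.exp(-kl_dirichlet(a, a2)) for w, a2 in zip(w_f, alphas_f))
        den = sum(w * math.exp(-kl_dirichlet(a, b)) for w, b in zip(w_g, alphas_g))
        total += wa * math.log(num / den)
    return total


def log_dirichlet_pdf(x, a):
    """Log density of Dir(a) at a point x in the open simplex."""
    norm = math.lgamma(sum(a)) - sum(math.lgamma(ai) for ai in a)
    return norm + sum((ai - 1.0) * math.log(xi) for ai, xi in zip(a, x))


def log_mixture_pdf(x, weights, alphas):
    """Log density of a Dirichlet mixture, computed with the log-sum-exp trick."""
    logs = [math.log(w) + log_dirichlet_pdf(x, a) for w, a in zip(weights, alphas)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


def kl_monte_carlo(w_f, alphas_f, w_g, alphas_g, n=20000, seed=0):
    """Baseline: average the log density ratio over n samples drawn from f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        a = rng.choices(alphas_f, weights=w_f)[0]       # pick a component of f
        g = [rng.gammavariate(ai, 1.0) for ai in a]     # normalized gammas give a Dirichlet draw
        s = sum(g)
        x = [gi / s for gi in g]
        total += log_mixture_pdf(x, w_f, alphas_f) - log_mixture_pdf(x, w_g, alphas_g)
    return total / n


if __name__ == "__main__":
    # Hypothetical two-component mixtures on the 3-part simplex.
    w_f, alphas_f = [0.6, 0.4], [[5.0, 2.0, 2.0], [2.0, 2.0, 6.0]]
    w_g, alphas_g = [0.5, 0.5], [[4.0, 3.0, 2.0], [2.0, 3.0, 5.0]]
    print("variational:", kl_variational(w_f, alphas_f, w_g, alphas_g))
    print("Monte Carlo:", kl_monte_carlo(w_f, alphas_f, w_g, alphas_g))
```

The variational estimate needs only one closed-form Dirichlet KL evaluation per pair of components, whereas the Monte Carlo baseline evaluates both mixture densities at every sample, which is where the efficiency gap described in the abstract would come from. When both mixtures have a single component, the variational formula reduces to the exact Dirichlet KL divergence.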