Mixture models are a standard tool in statistical analysis, widely used for density modeling and model-based clustering. In this work, we propose a Bayesian mixture model with repulsion between mixture components. Such repulsion helps address the problem of overlapping or poorly separated clusters, and improves model interpretability and robustness. Our modeling approach introduces repulsion via a generalized Matérn type-III repulsive point process model, obtained by applying a dependent sequential thinning scheme to a latent Poisson point process. A key feature of our model is that, in contrast to most existing approaches to modeling repulsion, efficient posterior inference is possible via a Gibbs sampler that exploits the latent Poisson structure of our problem. This novel sampler also allows posterior inference over the number of clusters, and is of independent interest even in standard clustering applications without repulsion. We demonstrate the utility of the proposed method on a number of synthetic and real-world problems.
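To make the thinning construction concrete, the following is a minimal sketch of the classical hard-core Matérn type-III process on a square. It is not the generalized model proposed in this work. The function name `matern_type3`, the parameters `rate`, `radius`, and `side`, and the choice of a two-dimensional square domain are all assumptions made for illustration. Points of a latent Poisson process receive independent uniform birth times and are visited in birth order; a point survives only if no earlier surviving point lies within `radius` of it.

```python
import numpy as np


def matern_type3(rate, radius, rng, side=1.0):
    """Hard-core Matern type-III sample on [0, side]^2 (illustrative only)."""
    # Latent (primary) homogeneous Poisson process on the square.
    n = rng.poisson(rate * side**2)
    points = rng.uniform(0.0, side, size=(n, 2))
    # Independent uniform birth times fix the order of sequential thinning.
    order = np.argsort(rng.uniform(size=n))
    kept = []
    for i in order:
        # Only earlier points that themselves survived can thin point i;
        # this dependence on survivors is what makes the scheme type III.
        if all(np.linalg.norm(points[i] - points[j]) >= radius for j in kept):
            kept.append(i)
    return points[np.array(kept, dtype=int)], points


rng = np.random.default_rng(0)
survivors, primary = matern_type3(rate=200.0, radius=0.1, rng=rng)
print(len(primary), "primary points,", len(survivors), "survivors")
```

In a repulsive mixture model of this kind, the surviving points would play the role of component locations, so that no two components sit closer than `radius`. A softer, probabilistic thinning rule would replace the hard distance check above.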