The distance-dependent Chinese Restaurant Process (ddCRP) provides a flexible prior distribution for clustering observations, incorporating covariate information through pairwise distances and accommodating a rich variety of cluster structures. When the prior on the cluster parameters is conjugate to the likelihood, Bayesian inference is straightforward. In the non-conjugate setting, however, inference becomes substantially more challenging because the parameter space is trans-dimensional: the number of cluster parameters changes as cluster assignments change. We develop a reversible jump Markov chain Monte Carlo (RJMCMC) framework to address this challenge, handling the change in dimension of the cluster parameter vector when observation assignments are updated. We introduce and compare several proposal strategies for birth and death moves, including prior-based, independence, and data-driven moment-matching proposals that target regions of high posterior density. For fixed-dimensional moves, we propose a posterior resampling strategy that improves acceptance rates while maintaining computational efficiency. Through a simulation study and an application to Old Faithful eruption durations, we demonstrate that moment-matched proposals offer a principled, data-driven alternative to prior-based proposals. The resulting methodology provides a general RJMCMC framework for ddCRP models with non-conjugate likelihoods, demonstrated here on both discrete and continuous observation models.
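To make the ddCRP prior concrete, the following is a minimal sketch of drawing a partition from it, using the standard construction in which each observation links to itself with weight proportional to a concentration parameter and to another observation with weight proportional to a decay function of their distance, and clusters are the connected components of the resulting link graph. The function names, the exponential decay, the value of `alpha`, and the toy data are illustrative assumptions, not details taken from this work, and the sketch covers only the prior, not the RJMCMC sampler.

```python
import math
import random


def sample_ddcrp_links(distances, alpha, decay, rng):
    """Draw one link per observation from a ddCRP prior (illustrative sketch)."""
    n = len(distances)
    links = []
    for i in range(n):
        # Weight alpha for a self-link, decay(d_ij) for a link to j != i.
        weights = [alpha if j == i else decay(distances[i][j]) for j in range(n)]
        links.append(rng.choices(range(n), weights=weights, k=1)[0])
    return links


def links_to_clusters(links):
    """Label clusters as connected components of the undirected link graph."""
    n = len(links)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy usage: five scalar covariates, absolute difference as the distance.
rng = random.Random(0)
x = [0.0, 0.1, 0.2, 5.0, 5.1]
distances = [[abs(a - b) for b in x] for a in x]
links = sample_ddcrp_links(
    distances, alpha=1.0, decay=lambda d: math.exp(-d), rng=rng
)
labels = links_to_clusters(links)
print(links, labels)
```

Because the number of clusters is a by-product of the links, changing a single link can merge or split clusters. With a non-conjugate likelihood the cluster parameters cannot be integrated out, so such a change adds or removes a parameter, which is the dimension change that birth and death moves have to handle.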