Enriched Dirichlet process mixture (EDPM) models are Bayesian nonparametric models that can be used for nonparametric regression and conditional density estimation. They overcome a key disadvantage of jointly modeling the response and predictors with a Dirichlet process mixture (DPM) model: when the number of predictors is large, the clusters induced by the DPM are determined overwhelmingly by the predictors rather than by the response. A truncation approximation to a DPM allows a blocked Gibbs sampling algorithm to be used in place of a Polya urn sampling algorithm. The blocked Gibbs sampler offers a potential improvement in mixing, and the truncation approximation also allows implementation in standard software ($\textit{rjags}$ and $\textit{rstan}$). In this paper we introduce an analogous truncation approximation for an EDPM. We show that, with sufficiently large truncation values, the truncation yields a precise approximation to the EDP prior. Using a simulated example, we verify that the blocked Gibbs sampler run with the minimum truncation values that attain adequate error bounds achieves accuracy similar to that of the same sampler run with large truncation values. We further use the simulated example to show that the blocked Gibbs sampler mixes better than the Polya urn sampler, especially as the number of covariates increases.
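As a rough illustration of what a truncation approximation of this kind can look like, the sketch below draws nested stick-breaking weights: one truncated set for response-level clusters and, within each of those, one truncated set for predictor-level subclusters. It assumes the common finite stick-breaking truncation, in which the last stick fraction is set to 1 so the weights sum to one. The function names, the truncation levels `n_y` and `n_x`, and the shared concentration parameter `alpha_psi` are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np


def truncated_stick_breaking(alpha, n_atoms, rng):
    """Draw weights from a stick-breaking prior truncated at n_atoms.

    Assumption: the final stick fraction is fixed at 1 so that the
    n_atoms weights sum to one.
    """
    v = rng.beta(1.0, alpha, size=n_atoms)
    v[-1] = 1.0  # the last break takes whatever stick is left
    # Length of stick remaining before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def truncated_edp_weights(alpha_theta, alpha_psi, n_y, n_x, seed=0):
    """Nested truncated weights: n_y outer clusters, n_x subclusters in each.

    Hypothetical helper: a single alpha_psi is shared across outer clusters
    purely to keep the sketch short.
    """
    rng = np.random.default_rng(seed)
    w_y = truncated_stick_breaking(alpha_theta, n_y, rng)
    w_x = np.stack(
        [truncated_stick_breaking(alpha_psi, n_x, rng) for _ in range(n_y)]
    )
    return w_y, w_x


w_y, w_x = truncated_edp_weights(alpha_theta=1.0, alpha_psi=1.0, n_y=20, n_x=10)
joint = w_y[:, None] * w_x  # weight of each (outer cluster, subcluster) pair
```

Because every weight vector has fixed, finite length, all cluster indicators can in principle be updated together in one block, which is the property a blocked Gibbs sampler relies on. Larger `n_y` and `n_x` leave less prior mass to the forced final stick, at the cost of more atoms to update per iteration.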