Variable fusion in linear regression models is a statistical method that identifies covariates making similar contributions to the response variable and imposes the same coefficient value on them. For practical reasons, many variable fusion methods also incorporate variable selection. In this paper, we propose a spike-and-slab-based Bayesian method, set within the Bayesian model averaging (BMA) framework, that performs both variable fusion and selection. This is challenging in the BMA framework because one must construct a discrete model space that accommodates both selection and fusion and assign suitable priors over that space. We devise a prior distribution for latent variables that represent the model, which allows the model space for variable fusion and selection to be explored by Gibbs sampling. Furthermore, among non-local priors, which have superior model selection properties, we construct a prior tailored for variable fusion and use it as the slab distribution. We examine the effectiveness of the proposed method through theoretical and empirical studies.
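To make the notion of a joint fusion-and-selection model concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that a single model is encoded by a hypothetical integer label per covariate, where label 0 means the covariate is excluded and covariates with the same positive label share one coefficient. The helper names `fused_design` and `fit_fused` are illustrative, and ordinary least squares stands in for the Bayesian machinery (spike-and-slab prior, non-local slab, Gibbs sampling), which is not reproduced here.

```python
import numpy as np

def fused_design(X, labels):
    """Collapse the columns of X according to hypothetical group labels.

    labels[j] == 0     -> covariate j is excluded (selection)
    labels[j] == g > 0 -> covariate j shares the coefficient of group g (fusion)
    """
    labels = np.asarray(labels)
    groups = sorted(g for g in set(labels.tolist()) if g != 0)
    # Membership matrix M: M[j, k] = 1 if covariate j belongs to the k-th group.
    M = np.zeros((X.shape[1], len(groups)))
    for k, g in enumerate(groups):
        M[labels == g, k] = 1.0
    # Summing the columns within a group is equivalent to forcing equal coefficients.
    return X @ M, M

def fit_fused(X, y, labels):
    """Least-squares fit under the equality and zero constraints implied by labels."""
    Z, M = fused_design(X, labels)
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return M @ gamma  # expand back to one coefficient per original covariate

# Simulated data (an assumption for illustration): two fused covariates,
# one separate covariate, and two irrelevant ones.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([2.0, 2.0, -1.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

beta_hat = fit_fused(X, y, labels=[1, 1, 2, 0, 0])
print(beta_hat)
```

In this encoding, each label vector corresponds to one point in a discrete model space, which is the kind of space a BMA approach has to place a prior over and explore. Here the label vector is fixed by hand rather than sampled.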