The quantile varying coefficient (VC) model can flexibly capture dynamic patterns in regression coefficients. Because it is built on the quantile check loss function, it is also robust to outliers and heavy-tailed distributions of the response variable, and it gives a more comprehensive picture of the data by exploring the conditional quantiles of the response. Although variable selection for high-dimensional quantile varying coefficient models has been studied extensively, Bayesian analysis has rarely been developed. This article proposes a Bayesian regularized quantile varying coefficient model that is robust to data heterogeneity while accommodating non-linear interactions between the effect modifier and the predictors. Important varying coefficients are selected through Bayesian variable selection, and multivariate spike-and-slab priors further improve performance by inducing exact sparsity. A Gibbs sampler is derived for efficient posterior inference of the sparse Bayesian quantile VC model through Markov chain Monte Carlo (MCMC). Simulations at specific quantile levels and under multiple heavy-tailed error distributions systematically examine the advantage of the proposed model over the alternatives in selection and estimation accuracy. In a case study of non-linear gene-environment interactions using data from the Nurses' Health Study (NHS), the proposed model identifies biologically sensible markers.
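The robustness attributed to the check loss can be illustrated with a minimal sketch. The code below is not the proposed Bayesian model. It only evaluates the standard check loss, rho_tau(u) = u * (tau - 1{u < 0}), and finds the constant that minimizes it on a toy sample with one gross outlier. The function names `check_loss` and `best_constant` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def check_loss(u, tau):
    """Standard quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    # Negative residuals are weighted by (tau - 1), non-negative ones by tau.
    return u * np.where(u < 0, tau - 1.0, tau)


def best_constant(y, tau):
    """Brute-force search over the sample values for the constant that
    minimizes the total check loss (hypothetical helper for illustration)."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau).sum() for c in y]
    return y[int(np.argmin(losses))]


# Toy data (assumed for illustration): four ordinary values and one outlier.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

median_fit = best_constant(y, tau=0.5)  # minimizer of the check loss at tau = 0.5
mean_fit = y.mean()                     # minimizer of the squared-error loss

print("check-loss fit (tau = 0.5):", median_fit)
print("least-squares fit:", mean_fit)
```

In this toy sample the check-loss fit at tau = 0.5 stays at the sample median, while the least-squares fit is pulled toward the outlier. Changing `tau` targets other conditional quantiles in the same way.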