The quantile varying coefficient (VC) model can flexibly capture dynamic patterns in regression coefficients. Because it is built on the quantile check loss function, it is also robust to outliers and heavy-tailed distributions of the response variable, and it gives a more comprehensive picture of the response by modeling its conditional quantiles. Although variable selection for high-dimensional quantile varying coefficient models has been studied extensively, Bayesian analysis has rarely been developed. The Bayesian regularized quantile varying coefficient model has been proposed to provide robustness against data heterogeneity while accommodating non-linear interactions between the effect modifier and the predictors. Important varying coefficients are selected through Bayesian variable selection, and incorporating multivariate spike-and-slab priors further improves performance by inducing exact sparsity. A Gibbs sampler has been derived for efficient posterior inference of the sparse Bayesian quantile VC model through Markov chain Monte Carlo (MCMC). The advantage of the proposed model over the alternatives in selection and estimation accuracy has been systematically investigated in simulations under specific quantile levels and multiple heavy-tailed model errors. In the case study, the proposed model identifies biologically sensible markers in a non-linear gene-environment interaction study using data from the Nurses' Health Study (NHS).
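As a minimal illustration of the check loss that underlies the robustness claim above, the following sketch evaluates the standard loss ρ_τ(u) = u(τ − 1{u < 0}) and shows that minimizing its sum over a constant recovers a sample quantile, even for heavy-tailed data. This is not the proposed Bayesian model or its Gibbs sampler. The function names `check_loss` and `argmin_check_loss`, the simulated t-distributed data, and the sample size are all assumptions made for illustration.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0).astype(float))


def argmin_check_loss(y, tau):
    """Return the data point c that minimizes sum_i rho_tau(y_i - c).

    The objective is piecewise linear in c with kinks at the data points,
    so searching over the observed values is enough to find a minimizer.
    """
    y = np.asarray(y, dtype=float)
    losses = np.array([check_loss(y - c, tau).sum() for c in y])
    return y[np.argmin(losses)]


# Hypothetical heavy-tailed sample (Student t with 2 degrees of freedom).
rng = np.random.default_rng(0)
y = rng.standard_t(df=2, size=501)

median_fit = argmin_check_loss(y, tau=0.5)
upper_fit = argmin_check_loss(y, tau=0.9)

print("tau=0.5 minimizer:", median_fit, "| sample median:", np.median(y))
print("tau=0.9 minimizer:", upper_fit, "| share of y <= fit:", np.mean(y <= upper_fit))
```

At τ = 0.5 the minimizer coincides with the sample median, which is insensitive to the extreme draws that would distort a least-squares fit. Other values of τ target other conditional quantiles in the same way.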