As more data are collected, it has become common to analyze multiple related responses from the same study. Existing variable selection methods select variables for all responses jointly, without considering that some features may predict only a subset of the responses. Motivated by the multi-trait fine mapping problem in genetics, we develop a novel Bayesian indicator variable selection method for a large number of grouped predictors that targets multiple correlated and possibly heterogeneous responses. We demonstrate the advantage of our method through extensive simulations and a fine mapping example to identify causal variants associated with multiple addictive behaviors.