Cardiovascular disease (CVD) cohorts collect data longitudinally to study the association between CVD risk factors and event times. An important area of scientific research is to better understand which features of CVD risk factor trajectories are associated with the disease. We develop methods for feature selection in joint models, where feature selection is viewed as a bi-level variable selection problem with multiple features nested within multiple longitudinal risk factors. We modify a previously proposed Bayesian sparse group selection (BSGS) prior, which has not been implemented in joint models until now, to better represent prior beliefs when selecting features both at the group level (longitudinal risk factor) and within each group (features of a longitudinal risk factor). One advantage of our method over BSGS is its ability to account for correlation among the features within a risk factor. As a result, it selects important features about as well as BSGS but excludes unimportant features within risk factors more efficiently. We evaluate our prior via simulations and apply our method to data from the Atherosclerosis Risk in Communities (ARIC) study, a population-based, prospective cohort study of over 15,000 men and women aged 45-64, measured at baseline and at six additional times. We evaluate which CVD risk factors and which characteristics of their trajectories (features) are associated with death from CVD. We find that systolic and diastolic blood pressure, glucose, and total cholesterol are important risk factors with different important features associated with CVD death in both men and women.
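To make the bi-level structure concrete, the following is a minimal sketch of one draw from a generic two-level spike-and-slab prior, in which a feature's coefficient can be nonzero only if both its risk factor (group) and the feature itself are selected. It is not the modified BSGS prior described above: the function name `draw_bilevel_prior`, the inclusion probabilities, and the normal slab are illustrative assumptions, and the sketch treats features within a group as independent, so it does not capture the within-group correlation that the proposed prior accounts for.

```python
import numpy as np


def draw_bilevel_prior(n_groups, n_features, p_group=0.5, p_feature=0.5,
                       slab_sd=1.0, rng=None):
    """Draw once from a generic bi-level spike-and-slab prior (illustrative).

    Groups stand in for longitudinal risk factors; each group holds
    n_features trajectory features. All hyperparameters are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Group-level indicators: is the risk factor selected at all?
    group_on = rng.random(n_groups) < p_group

    # Within-group indicators: is each feature selected within its group?
    feature_on = rng.random((n_groups, n_features)) < p_feature

    # A feature is active only when both its group and the feature are selected.
    active = group_on[:, None] & feature_on

    # Slab: normal draw for active features; spike: exactly zero otherwise.
    slab = rng.normal(0.0, slab_sd, size=(n_groups, n_features))
    coefs = np.where(active, slab, 0.0)
    return group_on, active, coefs
```

Calling, for example, `draw_bilevel_prior(4, 3)` returns the group indicators, the combined group-and-feature indicators, and a 4-by-3 coefficient array in which every row belonging to an unselected group is entirely zero.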