The construction of coherent prediction models is of great importance in medical research, as such models enable health researchers to gain deeper insights into disease epidemiology and clinicians to identify patients at higher risk of adverse outcomes. One commonly employed approach to developing prediction models is variable selection through penalized regression techniques. Integrating natural variable structures into this process not only enhances model interpretability but can also increase the likelihood of recovering the true underlying model and boost prediction accuracy. However, a challenge lies in determining how to effectively integrate potentially complex selection dependencies into the penalized regression. In this work, we demonstrate how to represent selection dependencies mathematically, provide algorithms for deriving the complete set of potential models, and offer a structured approach for integrating complex rules into variable selection through the latent overlapping group Lasso. To illustrate our methodology, we apply these techniques to construct a coherent prediction model for major bleeding in hypertensive patients recently hospitalized for atrial fibrillation and subsequently prescribed oral anticoagulants. In this application, we account for a proxy of anticoagulant adherence and its interactions with dosage and type of oral anticoagulant, in addition to drug-drug interactions.