In multi-state models based on high-dimensional data, effective modeling strategies are required to determine an optimal, ideally parsimonious model. In particular, covariate effects must be linked across transitions to enable joint variable selection. A useful technique for reducing model complexity is to account for covariate effects that are homogeneous across distinct transitions. We integrate this approach into data-driven variable selection by means of extended regularization methods for multi-state model building. We propose fused sparse-group lasso (FSGL) penalized Cox-type regression in the framework of multi-state models, combining a penalty on pairwise differences of covariate effects with transition grouping. For optimization, we adapt the alternating direction method of multipliers (ADMM) algorithm to transition-specific hazards regression in the multi-state setting. In a simulation study and an application to acute myeloid leukemia (AML) data, we evaluate the algorithm's ability to select a sparse model that incorporates relevant transition-specific effects and similar cross-transition effects. We investigate settings in which the combined penalty is beneficial compared to global lasso regularization.
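To make the structure of the combined penalty concrete, the following minimal sketch evaluates an FSGL-type penalty for a matrix of transition-specific coefficients. It is an illustration under stated assumptions, not the implementation used in this work: the function name `fsgl_penalty`, the tuning parameters `lam`, `alpha`, and `gamma`, the convex-combination weighting, the choice of one group per transition, and the default of fusing all transition pairs are all hypothetical and may differ from the parameterization adopted in the paper.

```python
import math
from itertools import combinations


def fsgl_penalty(beta, lam=1.0, alpha=0.5, gamma=0.5, pairs=None):
    """Evaluate an illustrative fused sparse-group lasso penalty.

    beta  : list of per-transition coefficient lists (same length each),
            so beta[q][p] is the effect of covariate p on transition q.
    lam   : overall penalty strength (assumed parameterization).
    alpha : balance between lasso (alpha=1) and group lasso (alpha=0).
    gamma : balance between sparsity terms (gamma=1) and fusion (gamma=0).
    pairs : transition pairs whose effects are fused; by default all
            pairs (an assumption made here for simplicity).
    """
    if pairs is None:
        pairs = list(combinations(range(len(beta)), 2))

    # Lasso term: absolute value of every single coefficient.
    lasso = sum(abs(b) for row in beta for b in row)

    # Group term: Euclidean norm per transition, scaled by sqrt(group size).
    group = sum(
        math.sqrt(len(row)) * math.sqrt(sum(b * b for b in row))
        for row in beta
    )

    # Fusion term: absolute pairwise differences of the same covariate's
    # effect across two transitions; zero when the effects are identical.
    fused = sum(
        abs(bq - br)
        for q, r in pairs
        for bq, br in zip(beta[q], beta[r])
    )

    return lam * (
        alpha * gamma * lasso
        + (1.0 - alpha) * gamma * group
        + (1.0 - gamma) * fused
    )


if __name__ == "__main__":
    # Three transitions, two covariates; transitions 0 and 1 share an effect.
    beta = [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
    print(fsgl_penalty(beta, alpha=1.0, gamma=1.0))  # pure lasso
    print(fsgl_penalty(beta, gamma=0.0))             # pure fusion
```

In this sketch, setting `alpha=1` and `gamma=1` recovers a global lasso penalty, which is the comparator mentioned above, while lowering `gamma` shifts weight toward the fusion term that rewards equal covariate effects across transitions.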