We develop a functional proportional hazards mixture cure (FPHMC) model with scalar and functional covariates measured at baseline. The mixture cure model, which is useful for studying populations in which a fraction of subjects is cured of a particular event of interest, is extended to functional data. We employ the EM algorithm and develop a semiparametric penalized spline-based approach to estimate the dynamic functional coefficients of the incidence and latency parts. The proposed method is computationally efficient and enforces smoothness in the estimated functional coefficients via a roughness penalty. Simulation studies show satisfactory performance of the proposed method in estimating the model parameters and the baseline survival function. Finally, the clinical potential of the model is demonstrated in two real data examples that incorporate rich high-dimensional biomedical signals as functional covariates measured at baseline and that constitute novel domains for applying cure survival models in contemporary medical settings. In particular, we analyze i) minute-by-minute physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014 to study the association between diurnal patterns of physical activity at baseline and 9-year mortality while adjusting for other biological factors; and ii) the impact of daily functional measures of disease severity collected in the intensive care unit (ICU) on post-ICU recovery and mortality. A software implementation and an illustration of the proposed estimation method are provided in R.
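For orientation, a generic proportional hazards mixture cure model with one functional covariate can be written as below. This is a minimal sketch of the standard formulation, not necessarily the exact parameterization used in the paper. All symbols are assumptions introduced here for illustration: $Z$ is a vector of scalar covariates, $X(s)$ is a functional covariate observed on a domain $\mathcal{S}$, $\pi$ is the probability of being uncured (incidence), $S_u$ and $h_u$ are the survival and hazard functions of uncured subjects (latency), $h_0$ is an unspecified baseline hazard, and $\lambda_\beta, \lambda_\theta$ are smoothing parameters.

```latex
\begin{align}
S_{\mathrm{pop}}(t \mid Z, X) &= 1 - \pi(Z, X) + \pi(Z, X)\, S_u(t \mid Z, X), \\
\operatorname{logit}\{\pi(Z, X)\} &= \alpha_0 + Z^{\top}\alpha + \int_{\mathcal{S}} X(s)\,\beta(s)\,ds, \\
h_u(t \mid Z, X) &= h_0(t)\,\exp\Big\{ Z^{\top}\gamma + \int_{\mathcal{S}} X(s)\,\theta(s)\,ds \Big\}, \\
\ell_{\mathrm{pen}} &= \ell(\alpha_0, \alpha, \beta, \gamma, \theta, h_0)
  - \lambda_{\beta} \int_{\mathcal{S}} \{\beta''(s)\}^2\,ds
  - \lambda_{\theta} \int_{\mathcal{S}} \{\theta''(s)\}^2\,ds .
\end{align}
```

Under this sketch, the functional coefficients $\beta(s)$ and $\theta(s)$ would be expanded in a spline basis, and the second-derivative penalties play the role of the roughness penalty. The unobserved cure status is the quantity that an EM algorithm would treat as missing data.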