The promotion time cure rate model (PCM) is an extensively studied model for the analysis of time-to-event data in the presence of a cured subgroup. Several strategies have been proposed in the literature to model the latency part of the PCM. However, few strategies have been proposed to investigate the effects of covariates on the incidence part of the PCM. In this regard, most existing studies assume that the boundary separating the cured and non-cured subjects with respect to the covariates is linear. As such, they can only capture simple effects of the covariates on the cured/non-cured probability. In this manuscript, we propose a new promotion time cure model that uses the support vector machine (SVM) to model the incidence part. The proposed model inherits the features of the SVM and provides flexibility in capturing non-linearity in the data. To the best of our knowledge, this is the first work that integrates the SVM with the PCM. For the estimation of model parameters, we develop an expectation maximization algorithm in which we use the sequential minimal optimization technique together with the Platt scaling method to obtain the posterior probabilities of being cured or non-cured. A detailed simulation study shows that the proposed model outperforms the existing logistic regression-based PCM as well as the spline regression-based PCM, which is also known to capture non-linearity in the data. This holds in terms of the bias and mean squared error of different quantities of interest, and also in terms of the predictive and classification accuracies of cure. Finally, we illustrate the applicability and superiority of our model using data from a study of leukemia patients who underwent bone marrow transplantation.
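As a rough illustration of the Platt scaling step mentioned above, the following sketch maps SVM decision values to probabilities by fitting a sigmoid, P(y = 1 | f) = 1 / (1 + exp(A f + B)). It is a simplified stand-in and not the estimation procedure of the manuscript: the function names `fit_platt` and `platt_prob`, the toy decision values, and the use of plain gradient descent (without the target smoothing of Platt's original method) are all assumptions made for illustration. The label 1 is assumed to denote a non-cured subject.

```python
import math

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit A, B in P(y=1 | f) = 1 / (1 + exp(A*f + B)) by gradient descent
    on the average log-loss. Hypothetical helper, simplified for illustration."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # With this parameterisation, d(loss)/dA = (y - p) * f
            # and d(loss)/dB = (y - p).
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_prob(score, a, b):
    """Probability of label 1 for a given SVM decision value."""
    return 1.0 / (1.0 + math.exp(a * score + b))

# Toy SVM decision values (assumed): positive values lean towards label 1.
scores = [-2.0, -1.0, -0.5, 0.3, 0.5, 1.0, 2.0, -0.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = [platt_prob(f, a, b) for f in scores]
```

In an expectation maximization setting, probabilities of this kind could serve as the incidence-side estimates that are refreshed at each iteration, although how they enter the E-step depends on the model details, which the abstract does not give.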