For right-censored data, we study the problem of predicting the restricted time to event from a set of covariates. Under a quadratic loss, this problem is equivalent to estimating the conditional Restricted Mean Survival Time (RMST). To this end, we propose a flexible and easy-to-use ensemble algorithm that combines pseudo-observations with the super learner. The classical theoretical results for the super learner are extended to right-censored data using a new definition of pseudo-observations, the so-called split pseudo-observations. Simulation studies indicate that the split pseudo-observations and the standard pseudo-observations are similar even for small sample sizes. The method is applied to maintenance and colon cancer datasets, where it is compared with other prediction methods to demonstrate its practical value. We complement the predictions obtained from our method with the RMST-adapted risk measure, prediction intervals, and variable importance measures developed in our previous work.
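To make the notion of pseudo-observations for the RMST concrete, the following is a minimal sketch of the standard jackknife (leave-one-out) construction based on the Kaplan-Meier estimator. It is not the split pseudo-observation variant introduced in the paper, whose definition is not given in this abstract. The function names `km_rmst` and `rmst_pseudo_observations` and the horizon argument `tau` are hypothetical names chosen for illustration.

```python
import numpy as np


def km_rmst(times, events, tau):
    """RMST estimate: area under the Kaplan-Meier curve on [0, tau].

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    tau    : restriction horizon (assumed > 0)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    # Distinct observed event times no later than tau.
    event_times = np.unique(times[(events == 1) & (times <= tau)])

    surv, prev, area = 1.0, 0.0, 0.0
    for u in event_times:
        at_risk = np.sum(times >= u)
        deaths = np.sum((times == u) & (events == 1))
        area += surv * (u - prev)        # survival is flat on [prev, u)
        surv *= 1.0 - deaths / at_risk   # Kaplan-Meier drop at u
        prev = u
    area += surv * (tau - prev)          # last flat piece up to tau
    return area


def rmst_pseudo_observations(times, events, tau):
    """Standard jackknife pseudo-observations for the RMST:
    n * theta_hat - (n - 1) * theta_hat computed without subject i.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    n = len(times)
    full = km_rmst(times, events, tau)
    pseudo = np.empty(n)
    for i in range(n):
        loo = km_rmst(np.delete(times, i), np.delete(events, i), tau)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo
```

Under these assumptions, the resulting pseudo-observations can be used as an uncensored response in any regression learner, which is what allows an ensemble such as the super learner to be trained with a quadratic loss. When no observation is censored, each pseudo-observation reduces to the restricted time min(T_i, tau) itself.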