The conditional survival function of a time-to-event outcome subject to censoring and truncation is a common target of estimation in survival analysis. This parameter may be of scientific interest and also often appears as a nuisance in nonparametric and semiparametric problems. In addition to classical parametric and semiparametric methods (e.g., based on the Cox proportional hazards model), flexible machine learning approaches have been developed to estimate the conditional survival function. However, many of these methods are either implicitly or explicitly targeted toward risk stratification rather than overall survival function estimation. Others apply only to discrete-time settings or require inverse probability of censoring weights, which can be as difficult to estimate as the outcome survival function itself. Here, we employ a decomposition of the conditional survival function in terms of observable regression models in which censoring and truncation play no role. This allows application of an array of flexible regression and classification methods rather than only approaches that explicitly handle the complexities inherent to survival data. We outline estimation procedures based on this decomposition, empirically assess their performance, and demonstrate their use on data from an HIV vaccine trial.
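For orientation, the following is a sketch of one standard form such a decomposition can take. It is not necessarily the exact formulation used in this work. All notation is assumed here rather than taken from the abstract: $T$ is the event time, $C$ the censoring time, $X$ the covariates, $Y = \min(T, C)$ the observed time, and $\Delta = 1(T \le C)$ the event indicator. The sketch assumes that $T$ and $C$ are conditionally independent given $X$, and it omits truncation, which would modify the at-risk probability $R$.

```latex
% Observable quantities (assumed notation, no truncation):
%   F_1(t | x): sub-distribution of observed event times
%   R(t | x):   probability of still being at risk at time t
\begin{align*}
F_1(t \mid x) &= P(Y \le t,\ \Delta = 1 \mid X = x)
  = P(\Delta = 1 \mid X = x)\, P(Y \le t \mid \Delta = 1,\ X = x), \\
R(t \mid x) &= P(Y \ge t \mid X = x), \\
\Lambda(t \mid x) &= \int_0^t \frac{F_1(du \mid x)}{R(u \mid x)}, \\
S(t \mid x) &= \prod_{u \in (0,\, t]} \bigl\{ 1 - \Lambda(du \mid x) \bigr\}
  \;=\; \exp\{-\Lambda(t \mid x)\} \quad \text{if } T \text{ is continuous given } X = x.
\end{align*}
```

Each ingredient on the right-hand side is a feature of the distribution of the fully observed triple $(X, Y, \Delta)$: a binary regression for $P(\Delta = 1 \mid X = x)$, a conditional distribution function of $Y$ among uncensored observations, and a conditional survival function of $Y$. In principle, each can be estimated with general-purpose regression or classification tools.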