Censoring is the central problem in survival analysis, where either the time-to-event (for instance, death) or the time-to-censoring (such as loss to follow-up) is observed for each sample. The majority of existing machine learning-based survival analysis methods assume that survival is conditionally independent of censoring given a set of covariates, an assumption that cannot be verified since only marginal distributions are available from the data. The existence of dependent censoring, along with the inherent bias in current estimators, has been demonstrated in a variety of applications, accentuating the need for a more nuanced approach. However, existing methods that adjust for dependent censoring require practitioners to specify the ground-truth copula. This requirement poses a significant challenge for practical applications, as model misspecification can lead to substantial bias. In this work, we propose a flexible deep learning-based survival analysis method that accommodates dependent censoring while eliminating the requirement to specify the ground-truth copula. We theoretically prove the identifiability of our model under a broad family of copulas and survival distributions. Experimental results on a wide range of datasets demonstrate that our approach successfully discerns the underlying dependency structure and significantly reduces survival estimation bias when compared to existing methods.