In many clinical contexts, estimating treatment effects from time-to-event data is complicated not only by confounding, censoring, and heterogeneity, but also by the presence of a cured subpopulation in which the event of interest never occurs. In such settings, treatment may have distinct effects on (1) the probability of being cured and (2) the event timing among non-cured individuals. Standard survival analysis and causal inference methods typically do not separate cured from non-cured individuals, which obscures the distinct mechanisms by which treatment acts on cure probability and event timing. To address these challenges, we propose a matching-based framework that constructs a separate set of matched groups for each estimand, so that heterogeneous treatment effects (HTE) on cure probability and on event timing are estimated separately. We use mixture cure models to identify feature importance for both estimands, which in turn informs weighted distance metrics for matching in high-dimensional spaces. Within matched groups, Kaplan-Meier estimators provide estimates of cure probability and expected time to event, from which individual-level treatment effects are derived. We provide theoretical guarantees for estimator consistency and distance metric optimality under an equal-scale constraint. We further decompose estimation error into contributions from censoring, model fitting, and irreducible noise. Simulations and real-world data analyses demonstrate that our approach delivers interpretable and robust HTE estimates in time-to-event settings.
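As a rough illustration of the within-group estimation step, the sketch below computes a Kaplan-Meier curve for each arm of a single matched group and takes the height of its final plateau as the cure probability estimate. This is a common nonparametric choice and an assumption here: the abstract does not specify the exact estimator, and the function names (`kaplan_meier`, `cure_probability`, `matched_group_cure_effect`) and the toy data are hypothetical. Matching itself and the expected time to event among non-cured individuals are not covered.

```python
import numpy as np


def kaplan_meier(times, events):
    """Return the distinct event times and the KM survival estimate just after each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # subjects still under observation at t
        d = np.sum((times == t) & events)       # events occurring exactly at t
        s *= 1.0 - d / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def cure_probability(times, events):
    """Assumed estimator: KM survival after the last observed event (the plateau)."""
    _, surv = kaplan_meier(times, events)
    return float(surv[-1]) if surv.size else 1.0


def matched_group_cure_effect(times, events, treated):
    """Difference in plateau heights between treated and control units in one group."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    treated = np.asarray(treated, dtype=bool)
    p_treated = cure_probability(times[treated], events[treated])
    p_control = cure_probability(times[~treated], events[~treated])
    return p_treated - p_control


# Toy matched group (made-up numbers): event = 1 means the event was observed,
# event = 0 means the subject was censored at that time.
times = [2, 5, 8, 10, 10, 1, 3, 4, 10]
events = [1, 0, 1, 0, 0, 1, 1, 1, 0]
treated = [1, 1, 1, 1, 1, 0, 0, 0, 0]
effect = matched_group_cure_effect(times, events, treated)
```

The plateau is only a sensible cure estimate when follow-up is long enough that few non-cured individuals remain at risk at the end of the study; with short follow-up it will overstate the cure probability in both arms.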