Surrogate-Assisted Targeted Learning for Nested Bridge Functionals under Administrative Censoring

Delayed primary outcomes and administratively censored follow-up create a general semiparametric estimation problem: the target causal functional depends on an endpoint observed only for a shrinking subset of units at analysis time, while earlier surrogate measurements remain widely available. In such settings, inverse-probability-weighted estimators can become unstable as observation probabilities approach the positivity boundary, and complete-case model-based analyses can be highly sensitive to outcome-model specification. We develop a surrogate-assisted targeted minimum loss estimator for this nested causal functional. Identification proceeds through a surrogate-bridge representation that integrates an observed-outcome regression over the conditional surrogate distribution, thereby avoiding inverse observation weights in the target parameter itself. We show that the estimator is asymptotically linear and doubly robust (in the sense that first-order bias vanishes when either nuisance component is consistently estimated), and we characterize two structural features of the problem: under surrogate-mediated missing at random, the censoring mechanism contributes no separate tangent-space component to the efficient influence function; and for nested bridge functionals, a one-step debiased machine-learning construction leaves a second-order cross-product remainder involving the conditional surrogate law. The proposed two-stage targeting step removes this term without requiring direct estimation of that law. Simulation studies demonstrate stable finite-sample performance under substantial administrative censoring, and a design-calibrated analysis based on the Washington State EPT study illustrates the method in a realistic stepped-wedge cluster-randomized setting.
