Identification and estimation of treatment effects on long-term outcomes in clinical trials with external observational data

In biomedical studies, estimating drug effects on chronic diseases requires a long follow-up period, which is difficult to achieve in randomized clinical trials (RCTs). Replacing the long-term outcome with a short-term surrogate when assessing the drug effect relies on stringent assumptions that empirical studies often fail to satisfy. Motivated by a kidney disease study, we investigate drug effects on long-term outcomes by combining an RCT in which the long-term outcome is not observed with an observational study in which the long-term outcome is observed but unmeasured confounding may exist. Under a mean exchangeability assumption weaker than those in the previous literature, we identify the average treatment effects in the RCT and derive the associated efficient influence function and semiparametric efficiency bound. Furthermore, we propose a locally efficient doubly robust estimator and an inverse probability weighted (IPW) estimator. The former attains the semiparametric efficiency bound if all the working models are correctly specified. The latter has a simpler form and requires far fewer model specifications. The IPW estimator using estimated propensity scores is more efficient than the one using true propensity scores, and it achieves the semiparametric efficiency bound when the covariates and surrogates are discrete with finite support. Both estimators are shown to be consistent and asymptotically normal. Extensive simulations are conducted to evaluate the finite-sample performance of the proposed estimators. We apply the proposed methods to estimate the efficacy of oral hydroxychloroquine on renal failure in a real-world data analysis.
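The claim that estimated propensity scores can yield a more efficient IPW estimator than true ones may seem counterintuitive, so a small simulation sketch may help. The code below is not the estimator proposed in this paper, which combines an RCT with observational data and uses surrogates. It is a simplified single-trial toy example with one binary covariate, a known randomization probability of 0.5, and a hypothetical outcome model chosen only for illustration. The function name `simulate_ipw` and all parameter values are assumptions.

```python
import numpy as np


def simulate_ipw(n=200, reps=2000, seed=0):
    """Compare IPW estimates of an average treatment effect using the true
    versus an estimated propensity score (toy setting, true effect = 1)."""
    rng = np.random.default_rng(seed)
    true_e = 0.5  # known randomization probability (assumed for this toy trial)
    est_true, est_hat = [], []
    for _ in range(reps):
        x = rng.integers(0, 2, size=n)          # binary covariate
        a = rng.binomial(1, true_e, size=n)     # randomized treatment
        y = 2.0 * x + 1.0 * a + rng.normal(size=n)  # hypothetical outcome model

        # IPW with the true propensity score
        est_true.append(np.mean(a * y / true_e - (1 - a) * y / (1 - true_e)))

        # IPW with the propensity score estimated as the treated
        # fraction within each covariate stratum
        strata_e = np.array([a[x == k].mean() for k in (0, 1)])
        e_hat = np.clip(strata_e[x], 1e-3, 1 - 1e-3)  # guard against 0 or 1
        est_hat.append(np.mean(a * y / e_hat - (1 - a) * y / (1 - e_hat)))
    return np.array(est_true), np.array(est_hat)


if __name__ == "__main__":
    with_true, with_estimated = simulate_ipw()
    print("variance with true propensity score:     ", with_true.var())
    print("variance with estimated propensity score:", with_estimated.var())
```

In this toy setting, estimating the propensity score within strata removes chance imbalance in the covariate between arms, which is why the second estimator tends to vary less across replications even though both target the same effect.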
