Evaluating treatment effects is critical in clinical trials but sometimes involves lengthy, invasive, or costly follow-up procedures. In these cases, surrogate markers, which provide intermediate measures of the long-term treatment effect, allow clinicians to obtain results faster and more efficiently than would otherwise be possible. Before a surrogate marker is adopted, its utility (i.e., its ability to capture the treatment effect on the primary outcome) must be statistically validated. Many frameworks for evaluating surrogate markers have been proposed, but they do not account for missing data. Instead, they rely on complete cases (the subset of patients without missing data), an approach that can be inefficient and can yield biased estimates. To improve on this, we propose methods to accommodate missing data in nonparametric and parametric surrogate evaluation via inverse probability weighting (IPW) and semiparametric maximum likelihood estimation (SMLE). Through simulation studies, we demonstrate that the proposed methods remain unbiased under a broader range of missing data mechanisms than complete case analysis and can help retain the statistical precision of the full trial. The two missing data corrections have complementary strengths with respect to computational ease, robustness, and statistical efficiency. We illustrate their practical utility through an application to a diabetes clinical trial. All methods are implemented in the MissSurrogate R package.
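To make the contrast between complete case analysis and IPW concrete, the following is a minimal sketch in Python of the general weighting idea, not the MissSurrogate implementation. Everything in it is an assumption made for illustration: the simulated data, the binary covariate `x`, the observation probabilities, and the function names are hypothetical and do not come from the paper. The surrogate is more likely to be missing when `x = 1`, so the complete case mean is pulled toward the `x = 0` group, while reweighting each observed patient by the inverse of the estimated probability of being observed approximately recovers the full-data mean.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, s, observed) rows; all settings are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0           # binary baseline covariate
        s = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)      # surrogate marker, true mean 2.0
        p_obs = 0.3 if x == 1 else 0.9               # missingness depends only on x
        observed = rng.random() < p_obs
        rows.append((x, s, observed))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def complete_case_mean(rows):
    """Average the surrogate over patients whose surrogate was observed."""
    return mean(s for _, s, obs in rows if obs)


def ipw_mean(rows):
    """Weighted average of observed surrogates with inverse probability weights."""
    # Estimate P(observed | x) by the observed fraction within each level of x.
    pi_hat = {}
    for level in (0, 1):
        flags = [obs for x, _, obs in rows if x == level]
        pi_hat[level] = sum(flags) / len(flags)
    # Normalized (Hajek-type) estimator: weights are 1 / pi_hat[x].
    num = sum(s / pi_hat[x] for x, s, obs in rows if obs)
    den = sum(1.0 / pi_hat[x] for x, _, obs in rows if obs)
    return num / den


rows = simulate(20000)
full = mean(s for _, s, _ in rows)   # benchmark that uses the unobservable full data
cc = complete_case_mean(rows)
ipw = ipw_mean(rows)

print(f"full data:     {full:.3f}")
print(f"complete case: {cc:.3f}")
print(f"IPW:           {ipw:.3f}")
```

In this toy setting the missingness depends only on a fully observed covariate, which is the kind of mechanism under which weighting can remove the bias of a complete case analysis. The sketch targets a simple mean rather than a surrogate evaluation measure, and it says nothing about the SMLE approach.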