Modeling symptom progression to identify informative subjects for a new Huntington's disease clinical trial is challenging because time to diagnosis, a key covariate, can be heavily censored. Imputation is an appealing strategy in which censored covariates are replaced with their conditional means, but existing methods exhibit over 200% bias under heavy censoring. Calculating these conditional means accurately requires estimating the survival function of the censored covariate and then integrating it from the censored value to infinity. To estimate the survival function flexibly, existing methods use the semiparametric Cox model with Breslow's estimator. For the integration, they use the trapezoidal rule, which is not designed for improper integrals and therefore introduces bias. We propose calculating the conditional mean with adaptive quadrature instead, which can handle the improper integral. Yet, even with adaptive quadrature, the integrand (the survival function) is undefined beyond the observed data, so we identify the "Weibull extension" as the best method to extrapolate and then integrate. In simulation studies, we show that replacing the trapezoidal rule with adaptive quadrature and adopting the Weibull extension corrects the bias seen with existing methods. We further show how imputing with corrected conditional means helps to prioritize patients for future clinical trials.
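To make the integration step concrete, the following is a minimal Python sketch rather than the authors' implementation. It relies on the standard identity E[X | X > c] = c + (1 / S(c)) ∫_c^∞ S(x) dx, where S is the survival function of the censored covariate and c is the censored value. Several things here are assumptions made for illustration: a fully parametric Weibull survival function stands in for the Breslow estimate plus its extrapolation, a hand-written adaptive Simpson rule stands in for whichever adaptive quadrature routine is actually used, and all function names and the observation grid are hypothetical.

```python
import math


def weibull_survival(x, shape, scale):
    """Weibull survival function S(x) = exp(-(x / scale) ** shape)."""
    return math.exp(-((x / scale) ** shape))


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=40):
    """Adaptive Simpson quadrature of f over the finite interval [a, b]."""
    def simpson(fa, fm, fb, width):
        return width * (fa + 4.0 * fm + fb) / 6.0

    def recurse(lo, flo, mid, fmid, hi, fhi, whole, tol, depth):
        lmid, rmid = 0.5 * (lo + mid), 0.5 * (mid + hi)
        flmid, frmid = f(lmid), f(rmid)
        left = simpson(flo, flmid, fmid, mid - lo)
        right = simpson(fmid, frmid, fhi, hi - mid)
        err = left + right - whole
        # Accept the refined estimate once it agrees with the coarse one.
        if depth <= 0 or abs(err) <= 15.0 * tol:
            return left + right + err / 15.0
        return (recurse(lo, flo, lmid, flmid, mid, fmid, left, tol / 2, depth - 1)
                + recurse(mid, fmid, rmid, frmid, hi, fhi, right, tol / 2, depth - 1))

    mid = 0.5 * (a + b)
    fa, fmid, fb = f(a), f(mid), f(b)
    whole = simpson(fa, fmid, fb, b - a)
    return recurse(a, fa, mid, fmid, b, fb, whole, tol, max_depth)


def tail_integral(survival, c, tol=1e-10):
    """Integrate survival(x) over [c, inf) using x = c + t / (1 - t), t in [0, 1)."""
    def g(t):
        if t >= 1.0:
            return 0.0  # assumption: survival decays faster than 1 / x**2
        one_minus_t = 1.0 - t
        return survival(c + t / one_minus_t) / (one_minus_t * one_minus_t)
    return adaptive_simpson(g, 0.0, 1.0, tol)


def conditional_mean(survival, c):
    """E[X | X > c] = c + (1 / S(c)) * integral of S(x) from c to infinity."""
    return c + tail_integral(survival, c) / survival(c)


def truncated_trapezoid_mean(survival, c, grid):
    """Same quantity, but with a trapezoidal sum that stops at max(grid)."""
    pts = [c] + sorted(x for x in grid if x > c)
    area = sum(0.5 * (survival(lo) + survival(hi)) * (hi - lo)
               for lo, hi in zip(pts, pts[1:]))
    return c + area / survival(c)


# Exponential case (shape = 1): by memorylessness the exact answer is c + scale = 3.
S = lambda x: weibull_survival(x, shape=1.0, scale=2.0)
quad = conditional_mean(S, 1.0)
trap = truncated_trapezoid_mean(S, 1.0, [1.5, 2.0, 3.0, 4.0])  # hypothetical grid
print(quad, trap)
```

In this toy setting the trapezoidal sum drops everything beyond the largest grid point, so it falls short of the exact value, while the quadrature over the full half-line recovers it. The size of the gap depends on how much survival mass lies beyond the last observed value, which is why heavy censoring makes the truncation costly.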