Adjusting for covariates in randomized controlled trials can enhance the credibility and efficiency of treatment effect estimation. However, handling numerous covariates and their non-linear transformations is challenging, particularly when outcomes and covariates have missing data. In this tutorial, we propose a principled Covariate Adjustment with Variable Selection and Missing Data Imputation (COADVISE) framework that enables (i) variable selection for covariates most relevant to the outcome, (ii) nonlinear adjustments, and (iii) robust imputation of missing data for both outcomes and covariates. This framework ensures consistent estimates with improved efficiency over unadjusted estimators and provides robust variance estimation, even under outcome model misspecification. We demonstrate efficiency gains through theoretical analysis and conduct extensive simulations to compare alternative variable selection strategies, offering cautionary recommendations. We showcase the practical utility of COADVISE by applying it to the Best Apnea Interventions for Research trial data from the National Sleep Research Resource. A user-friendly R package, Coadvise, facilitates implementation.