Modern Causal Inference Approaches to Improve Power for Subgroup Analysis in Randomized Controlled Trials

Randomized controlled trials (RCTs) often include subgroup analyses to assess whether treatment effects vary across pre-specified patient populations. However, these analyses frequently suffer from small sample sizes, which limit the power to detect heterogeneous effects. Power can be improved by leveraging predictors of the outcome (i.e., through covariate adjustment) and by borrowing external data from similar RCTs or observational studies. The benefits of covariate adjustment may be limited when the trial sample is small. Borrowing external data can increase the effective sample size and improve power, but it introduces two key challenges: (i) integrating data across sources can lead to model misspecification, and (ii) practical violations of the positivity assumption, where the probability of receiving the target treatment is near zero for some covariate profiles in the external data, can lead to extreme inverse-probability weights and unstable inferences, ultimately negating potential power gains. To overcome these challenges, we present an approach to improving power in pre-planned subgroup analyses of small RCTs that leverages both baseline predictors and external data. We propose debiased estimators that accommodate parametric, machine learning, and nonparametric Bayesian methods. To address practical positivity violations, we introduce three estimators: a covariate-balancing approach, an automated debiased machine learning (DML) estimator, and a calibrated DML estimator. We show improved power in various simulations and offer practical recommendations for applying the proposed methods. Finally, we apply them to evaluate the effectiveness of citalopram for negative symptoms in first-episode schizophrenia patients across subgroups defined by duration of untreated psychosis, using data from two small RCTs.
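
The point about practical positivity violations can be made concrete with a small numerical illustration. The sketch below is not one of the proposed estimators. It only shows, under assumed toy propensities, how a single near-zero probability of receiving the target treatment produces an extreme inverse-probability weight and collapses the effective sample size. The function names, the propensity values, and the use of Kish's effective-sample-size approximation as a diagnostic are all assumptions made for illustration.

```python
import numpy as np


def inverse_probability_weights(propensity):
    """Return 1 / e(x) for each unit, given its treatment probability e(x)."""
    p = np.asarray(propensity, dtype=float)
    return 1.0 / p


def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()


# Hypothetical propensities for 100 external-data units.
# Good overlap: every unit has probability 0.5 of the target treatment.
good_overlap = np.full(100, 0.5)

# Practical positivity violation: one covariate profile has probability 0.001.
poor_overlap = np.concatenate([np.full(99, 0.5), [0.001]])

ess_good = kish_effective_sample_size(inverse_probability_weights(good_overlap))
ess_poor = kish_effective_sample_size(inverse_probability_weights(poor_overlap))

print(f"max weight, good overlap: {inverse_probability_weights(good_overlap).max():.0f}")
print(f"max weight, poor overlap: {inverse_probability_weights(poor_overlap).max():.0f}")
print(f"effective sample size, good overlap: {ess_good:.1f}")
print(f"effective sample size, poor overlap: {ess_poor:.1f}")
```

In this toy setting, one unit with a weight of about 1000 dominates the other 99 units, so the weighted sample behaves like fewer than two observations even though 100 external records were borrowed. This is the sense in which extreme weights can negate the power gain that borrowing was meant to deliver.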
