Randomized controlled trials (RCTs) often estimate treatment effects inefficiently because of their small sample sizes. In recent years, incorporating external controls (ECs) has gained increasing attention as a way to augment small RCTs and thereby improve estimation efficiency. However, ECs are not always comparable to the RCT data, and borrowing them directly without careful evaluation can introduce substantial bias and, paradoxically, reduce the accuracy of treatment effect estimation. In this paper, we propose a novel adaptive influence-based sample-borrowing framework to improve average treatment effect (ATE) estimation in RCTs. The framework quantifies the ``comparability'' of each EC sample using influence functions and identifies the optimal subset of ECs that minimizes the mean squared error of the ATE estimator. The proposed framework is assumption-lean regarding the distribution of ECs and is robust to outliers, making it broadly applicable across diverse settings. Moreover, we develop an outcome calibration method that makes more efficient use of the EC data, further strengthening the sample-borrowing framework. We demonstrate the effectiveness of the proposed method on both simulated and real-world datasets.
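As a reading aid, the selection criterion can be written in notation that is assumed here for illustration and is not taken from the paper: let $\hat{\tau}(S)$ denote an ATE estimator that borrows the subset $S$ of EC samples, and let $\tau$ denote the true ATE in the RCT population. The standard bias-variance decomposition of the mean squared error then makes the trade-off explicit:

\[
\mathrm{MSE}\{\hat{\tau}(S)\}
  = \bigl[\mathbb{E}\{\hat{\tau}(S)\} - \tau\bigr]^{2}
  + \mathrm{Var}\{\hat{\tau}(S)\},
\qquad
S^{*} \in \arg\min_{S}\, \mathrm{MSE}\{\hat{\tau}(S)\}.
\]

In this notation, $S = \emptyset$ corresponds to the RCT-only estimator. Borrowing more EC samples typically shrinks the variance term, whereas borrowing incomparable samples can inflate the squared-bias term, which is why the choice of subset matters.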