We consider the problem of efficient inference on the average treatment effect in a sequential experiment where the policy governing the assignment of subjects to treatment or control can change over time. We first provide a central limit theorem for the Adaptive Augmented Inverse-Probability Weighted estimator, which is semiparametrically efficient, under weaker assumptions than those previously made in the literature. This central limit theorem enables efficient inference at fixed sample sizes. We then consider a sequential inference setting, deriving both asymptotic and nonasymptotic confidence sequences that are considerably tighter than those produced by previous methods. These anytime-valid methods enable inference under data-dependent stopping times (sample sizes). Additionally, we use propensity score truncation techniques from the recent off-policy estimation literature to reduce the finite-sample variance of our estimator without affecting the asymptotic variance. Empirical results demonstrate that our methods yield narrower confidence sequences than those previously developed in the literature while maintaining time-uniform error control.
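For orientation, the following is a minimal sketch of an adaptive AIPW-style point estimate together with a fixed-sample-size Wald interval of the kind a central limit theorem would justify. It is an illustration under simplifying assumptions, not the paper's implementation: there are no covariates, the outcome model is just the running mean of past outcomes in each arm, and the function names (`adaptive_aipw_scores`, `wald_interval`, `simulate`), the drifting policy, and the truncation level `clip=0.1` are all hypothetical. The interval is valid only at a pre-specified sample size; it is not a confidence sequence.

```python
import numpy as np

def adaptive_aipw_scores(actions, outcomes, propensities):
    """Per-round AIPW scores; their mean is the ATE estimate.

    propensities[t] is the probability with which treatment (action 1)
    was actually assigned at round t. The plug-in outcome means use only
    data from rounds before t (a simplifying, covariate-free assumption).
    """
    actions = np.asarray(actions, dtype=int)
    outcomes = np.asarray(outcomes, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    scores = np.empty(len(outcomes))
    sums = np.zeros(2)    # running outcome sums per arm
    counts = np.zeros(2)  # running sample counts per arm
    for t in range(len(outcomes)):
        # Arm means from past rounds only; 0.0 if the arm is still unseen.
        m = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
        a, y, p = actions[t], outcomes[t], propensities[t]
        if a == 1:
            correction = (y - m[1]) / p
        else:
            correction = -(y - m[0]) / (1.0 - p)
        scores[t] = m[1] - m[0] + correction
        sums[a] += y
        counts[a] += 1
    return scores

def wald_interval(scores, z=1.96):
    """Normal-approximation interval at a fixed sample size (z=1.96 ~ 95%)."""
    est = scores.mean()
    half = z * scores.std(ddof=1) / np.sqrt(len(scores))
    return est - half, est + half

def simulate(n, true_ate=1.0, clip=0.1, seed=0):
    """Toy experiment with a hypothetical time-varying, truncated policy."""
    rng = np.random.default_rng(seed)
    actions = np.empty(n, dtype=int)
    outcomes = np.empty(n)
    props = np.empty(n)
    for t in range(n):
        raw = 1.0 - 1.0 / (t + 2)                  # drifts from 0.5 toward 1
        p = float(np.clip(raw, clip, 1.0 - clip))  # truncate before assigning
        a = int(rng.random() < p)
        actions[t], props[t] = a, p
        outcomes[t] = true_ate * a + rng.normal()
    return actions, outcomes, props

actions, outcomes, props = simulate(20000)
scores = adaptive_aipw_scores(actions, outcomes, props)
low, high = wald_interval(scores)
print(f"ATE estimate: {scores.mean():.3f}, interval: [{low:.3f}, {high:.3f}]")
```

In this sketch the truncation is applied to the assignment probability itself, so the weights always match the probabilities that generated the data. How truncation is applied in the paper's estimator may differ.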