Adaptive randomized experiments update treatment probabilities as data accrue, but still require an end-of-study confidence interval for the average treatment effect (ATE) at a prespecified horizon. Under adaptive assignment, propensities can keep changing, so the predictable quadratic variation of the AIPW/DML score increments may remain random even in the limit. When no deterministic variance limit exists, Wald statistics normalized by a single long-run variance target can be conditionally miscalibrated given the realized variance regime. We assume no interference, sequential randomization, i.i.d. arrivals, and executed overlap on a prespecified scored set, and we require two auditable pipeline conditions: the platform logs the executed randomization probability for each unit, and the nuisance regressions used to score unit $t$ are constructed predictably from past data only. These conditions make the centered AIPW/DML scores an exact martingale difference sequence. Using self-normalized martingale limit theory, we show that the Studentized statistic, with variance estimated by the realized quadratic variation, is asymptotically $\mathrm{N}(0,1)$ at the prespecified horizon, even without variance stabilization. Simulations validate the theory and highlight when standard fixed-variance Wald reporting fails.
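To make the mechanism concrete, the following is a minimal simulation sketch, not taken from the paper: the epsilon-greedy design, Gaussian outcome model, sample sizes, and all function names are illustrative assumptions. Each unit is scored with a standard AIPW increment built from the logged executed propensity and per-arm outcome means fitted on past data only, so the centered scores $\hat\psi_t$ form a martingale difference sequence at the true ATE; the statistic is then Studentized by the realized quadratic variation $\big(\sum_t \hat\psi_t^2\big)^{1/2}$, and nominal 95% coverage can be checked empirically.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_run(n=2000, tau=0.5):
    """One adaptive experiment: epsilon-greedy assignment, AIPW scoring with
    logged executed propensities and predictable (past-only) nuisance means."""
    sums, counts = np.zeros(2), np.zeros(2)   # running per-arm outcome sums/counts
    psi = np.empty(n)
    for t in range(n):
        # Nuisance estimates mu_hat[a] use units 1..t-1 only (predictable).
        mu = sums / np.maximum(counts, 1.0)
        # Adaptive epsilon-greedy propensity; this executed probability is "logged"
        # and kept away from 0 and 1, so overlap holds as executed.
        e1 = 0.9 if mu[1] >= mu[0] else 0.1   # P(A_t = 1 | past)
        a = int(rng.random() < e1)
        y = tau * a + rng.normal()            # outcome model (assumption)
        # Centered AIPW score increment: a martingale difference at the true tau.
        p_a = e1 if a == 1 else 1.0 - e1
        psi[t] = mu[1] - mu[0] + (1 if a == 1 else -1) * (y - mu[a]) / p_a - tau
        sums[a] += y
        counts[a] += 1
    # Self-normalized (Studentized) statistic: score sum over the square root
    # of the realized quadratic variation sum(psi_t^2).
    return psi.sum() / np.sqrt((psi ** 2).sum())

# Coverage check: |T_n| <= 1.96 should hold ~95% of the time if T_n ~ N(0,1).
stats = np.array([one_run() for _ in range(500)])
print("empirical coverage of nominal 95%:", np.mean(np.abs(stats) <= 1.96))
```

Replacing the realized-quadratic-variation denominator in the last line of `one_run` with a single pre-committed variance constant yields the fixed-variance Wald report whose conditional miscalibration the abstract describes.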