We consider the problem of Adaptive Neyman Allocation for the class of augmented inverse probability weighted (AIPW) estimators in a design-based setting, where potential outcomes and covariates are deterministic. As each subject arrives, an adaptive procedure must select both a treatment assignment probability and a pair of linear predictors to be used in the AIPW estimator. Our goal is to construct an adaptive procedure that minimizes the Neyman Regret, defined as the difference between the variance under the adaptive procedure and the oracle variance attained by the optimal non-adaptive choice of assignment probabilities and linear predictors. While previous work has drawn insightful connections between Neyman Regret and online convex optimization for the Horvitz--Thompson estimator, a central challenge for the AIPW estimator is that the underlying optimization is non-convex. In this paper, we propose Sigmoid-FTRL, an adaptive experimental design that addresses the non-convexity via simultaneous minimization of two convex regrets. We prove that, under standard regularity conditions, the Neyman Regret of Sigmoid-FTRL converges at a $T^{-1/2} R$ rate, where $T$ is the number of subjects in the experiment and $R$ is the maximum norm of the covariate vectors. Moreover, we show that no adaptive design can improve upon the $T^{-1/2} R$ rate under our regularity conditions, establishing the minimax rate of Neyman Regret. Finally, we establish a central limit theorem and a consistently conservative variance estimator, which together facilitate the construction of asymptotically valid Wald-type confidence intervals.
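
To make the objects in the abstract concrete, the following is a minimal sketch of an AIPW estimate of the average treatment effect with linear predictors, together with its exact design-based mean and variance for a tiny experiment. It is not Sigmoid-FTRL. It assumes the standard AIPW form, independent Bernoulli assignments with fixed (non-adaptive) probabilities, and fixed predictor coefficients. The function names `aipw_estimate` and `exact_mean_and_variance` and all argument names are hypothetical, chosen only for illustration.

```python
from itertools import product


def aipw_estimate(y_obs, z, x, p, beta1, beta0):
    """AIPW estimate of the average treatment effect (illustrative sketch).

    y_obs[t]: observed outcome of subject t
    z[t]:     assignment in {0, 1}
    x[t]:     covariate vector (list of floats)
    p[t]:     probability that subject t is treated (assumed fixed here)
    beta1, beta0: coefficients of the linear predictors for the
                  treated and control potential outcomes
    """
    total = 0.0
    for t in range(len(y_obs)):
        m1 = sum(a * b for a, b in zip(x[t], beta1))  # predicted treated outcome
        m0 = sum(a * b for a, b in zip(x[t], beta0))  # predicted control outcome
        if z[t] == 1:
            correction = (y_obs[t] - m1) / p[t]
        else:
            correction = -(y_obs[t] - m0) / (1.0 - p[t])
        total += (m1 - m0) + correction
    return total / len(y_obs)


def exact_mean_and_variance(y1, y0, x, p, beta1, beta0):
    """Exact mean and variance over all 2^T assignment vectors.

    Potential outcomes y1, y0 are deterministic (design-based setting);
    the only randomness is the independent Bernoulli(p[t]) assignment.
    Enumeration is only feasible for very small T.
    """
    T = len(y1)
    mean = 0.0
    second_moment = 0.0
    for z in product((0, 1), repeat=T):
        prob = 1.0
        for t in range(T):
            prob *= p[t] if z[t] == 1 else 1.0 - p[t]
        y_obs = [y1[t] if z[t] == 1 else y0[t] for t in range(T)]
        est = aipw_estimate(y_obs, z, x, p, beta1, beta0)
        mean += prob * est
        second_moment += prob * est ** 2
    return mean, second_moment - mean ** 2
```

Under these assumptions the estimate is unbiased for any choice of probabilities and coefficients, while its variance depends on both. That dependence is what an allocation rule tries to exploit. Setting both coefficient vectors to zero recovers the Horvitz--Thompson estimator.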