In experimental design, Neyman allocation refers to the practice of assigning subjects to treated and control groups, potentially in unequal numbers, in proportion to the groups' respective standard deviations, with the objective of minimizing the variance of the treatment effect estimator. This widely recognized approach increases statistical power when the treated and control groups have different standard deviations, as is often the case in social experiments, clinical trials, marketing research, and online A/B testing. However, Neyman allocation cannot be implemented unless the standard deviations are known in advance. Fortunately, the multi-stage nature of these applications allows observations from earlier stages to be used to estimate the standard deviations, which in turn guide allocation decisions in later stages. In this paper, we introduce a competitive analysis framework to study this multi-stage experimental design problem. We propose a simple adaptive Neyman allocation algorithm that nearly matches the information-theoretic limit of conducting experiments. Using online A/B testing data from a social media site, we demonstrate the effectiveness of our adaptive Neyman allocation algorithm and highlight its practicality, especially when only a limited number of stages is available.
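To make the allocation rule concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes a difference-in-means estimator with variance sigma_T^2 / n_T + sigma_C^2 / n_C, for which the variance-minimizing treated share is sigma_T / (sigma_T + sigma_C). The two-stage routine is a simple plug-in illustration of using earlier observations to estimate the standard deviations. All function names, the equal pilot split, and the Gaussian outcomes in the demo are hypothetical choices made for illustration.

```python
import random
import statistics


def neyman_fraction(sd_treated, sd_control):
    """Share of subjects assigned to treatment under Neyman allocation."""
    total = sd_treated + sd_control
    if total == 0:
        return 0.5  # no variability in either arm: any split is optimal
    return sd_treated / total


def estimator_variance(sd_treated, sd_control, n_treated, n_control):
    """Variance of the difference-in-means treatment effect estimator."""
    return sd_treated ** 2 / n_treated + sd_control ** 2 / n_control


def two_stage_allocation(draw_treated, draw_control, n_total, n_pilot_per_arm):
    """Hypothetical two-stage plug-in rule (an assumption, not the paper's method).

    Stage 1 assigns n_pilot_per_arm subjects to each arm and estimates the
    standard deviations. Stage 2 assigns the remaining subjects so that the
    final arm sizes are as close as possible to the estimated Neyman split.
    """
    if n_pilot_per_arm < 2 or 2 * n_pilot_per_arm > n_total:
        raise ValueError("need 2 <= n_pilot_per_arm <= n_total / 2")
    pilot_treated = [draw_treated() for _ in range(n_pilot_per_arm)]
    pilot_control = [draw_control() for _ in range(n_pilot_per_arm)]
    fraction = neyman_fraction(
        statistics.stdev(pilot_treated), statistics.stdev(pilot_control)
    )
    target_treated = round(fraction * n_total)
    # Pilot subjects are already assigned, so each arm keeps at least that many.
    n_treated = min(max(target_treated, n_pilot_per_arm), n_total - n_pilot_per_arm)
    return n_treated, n_total - n_treated


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed outcome distributions: the treated arm is three times as noisy.
    sizes = two_stage_allocation(
        lambda: rng.gauss(0.0, 3.0), lambda: rng.gauss(0.0, 1.0), 1000, 100
    )
    print("final arm sizes (treated, control):", sizes)
    print("variance, equal split: ", estimator_variance(3.0, 1.0, 500, 500))
    print("variance, Neyman split:", estimator_variance(3.0, 1.0, 750, 250))
```

With standard deviations 3 and 1, the Neyman split assigns three quarters of the subjects to the noisier treated arm. The resulting variance is (sigma_T + sigma_C)^2 / n, which is lower than under an equal split whenever the two standard deviations differ.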