In experimental design, Neyman allocation refers to the practice of allocating subjects to treatment and control groups, potentially in unequal numbers proportional to the groups' respective standard deviations, with the objective of minimizing the variance of the treatment effect estimator. This widely recognized approach increases statistical power when the treatment and control groups have different standard deviations, as is often the case in social experiments, clinical trials, marketing research, and online A/B testing. However, Neyman allocation cannot be implemented unless the standard deviations are known in advance. Fortunately, the multi-stage nature of these applications allows observations from earlier stages to be used to estimate the standard deviations, which in turn guide allocation decisions in later stages. In this paper, we introduce a competitive analysis framework to study this multi-stage experimental design problem. We propose a simple adaptive Neyman allocation algorithm that almost matches the information-theoretic limit of conducting experiments. Using online A/B testing data from a social media site, we demonstrate the effectiveness of our adaptive Neyman allocation algorithm and highlight its practicality even when applied with only a limited number of stages.
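To make the allocation rule concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes the treatment effect is estimated by a difference in means, whose variance is sd_treated^2 / n_treated + sd_control^2 / n_control, so that the variance-minimizing share of subjects in the treatment group is sd_treated / (sd_treated + sd_control). The function names and the two-stage plan (a pilot stage to estimate the standard deviations, then one allocation decision) are hypothetical and chosen only for illustration.

```python
import statistics


def neyman_fraction(sd_treated, sd_control):
    """Share of subjects assigned to treatment under Neyman allocation."""
    total = sd_treated + sd_control
    if total == 0:
        # Both groups have zero spread, so every split gives zero variance.
        return 0.5
    return sd_treated / total


def estimator_variance(sd_treated, sd_control, n_treated, n_control):
    """Variance of the difference-in-means estimator for given group sizes."""
    return sd_treated ** 2 / n_treated + sd_control ** 2 / n_control


def two_stage_allocation(pilot_treated, pilot_control, n_total):
    """Hypothetical two-stage plan (an assumption, not the paper's method).

    Estimates the standard deviations from pilot outcomes, then returns the
    final group sizes (n_treated, n_control), pilot subjects included.
    """
    sd_t = statistics.stdev(pilot_treated)
    sd_c = statistics.stdev(pilot_control)
    target = round(n_total * neyman_fraction(sd_t, sd_c))
    # Pilot subjects are already assigned, so clamp to the feasible range.
    n_treated = min(max(target, len(pilot_treated)), n_total - len(pilot_control))
    return n_treated, n_total - n_treated


if __name__ == "__main__":
    # Illustrative numbers only: treated outcomes are twice as dispersed.
    print(two_stage_allocation([0, 4, 0, 4], [1, 3, 1, 3], 300))
```

With true standard deviations of 2 and 1 and 300 subjects, the difference-in-means variance is 0.03 under the 200/100 Neyman split, compared with about 0.033 under an equal 150/150 split. This gap is the gain in precision that the abstract attributes to unequal allocation.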