This paper studies sample-size design for finite-population test-and-roll experiments, where a decision-maker first conducts an experiment on $m$ units and then assigns the remaining $N-m$ units to the treatment that performs better in the experiment. We consider welfare-aware sample-size choice, which involves an exploration-exploitation tradeoff: larger experiments improve the rollout decision but impose welfare losses on experimental units assigned to the inferior treatment. We show that the standard absolute minimax regret criterion can lead to implausibly small experiments by over-penalizing exploration in its worst-case objective. To address this limitation, we propose the Worst-case Marginal Benefit (WMB) rule, which compares the worst-case marginal benefit of adding one more matched pair to the experiment with the corresponding marginal exploration cost. We establish a simple rule-of-thirds benchmark. For Bernoulli outcomes, after excluding pathological cases, the WMB criterion yields the optimal sample size of $m \approx N/3$ through a Gaussian approximation. For Gaussian outcomes with a known common variance, the same benchmark arises exactly. These results provide a prior-free and practically implementable guide for welfare-based sample-size design.
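The exploration-exploitation tradeoff described above can be illustrated with a minimal sketch. It assumes Gaussian outcomes with a known common variance, an experiment made up of $m/2$ matched pairs (one unit per treatment in each pair), and a fixed mean gap `delta` between the two treatments. The function names and the regret decomposition below are illustrative assumptions, not the paper's notation, and the sketch does not implement the WMB rule itself. It only evaluates expected regret at a given gap.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_regret(N: int, m: int, delta: float, sigma: float) -> float:
    """Expected welfare loss of a test-and-roll design, relative to an
    oracle that assigns all N units to the better treatment.

    Hypothetical setup (an assumption of this sketch):
      - m experimental units form m/2 matched pairs, one unit per treatment;
      - outcomes are Gaussian with common known standard deviation sigma;
      - delta >= 0 is the true gap in mean outcomes between treatments.
    """
    if m % 2 != 0 or not 0 <= m <= N:
        raise ValueError("m must be even and satisfy 0 <= m <= N")
    if delta < 0 or sigma <= 0:
        raise ValueError("delta must be non-negative and sigma positive")

    pairs = m // 2
    # The difference in sample means is Normal(delta, 2 * sigma^2 / pairs),
    # so the inferior treatment wins the experiment with this probability.
    # With pairs == 0 the expression equals 0.5, i.e. a fair coin flip.
    p_wrong = normal_cdf(-delta * math.sqrt(pairs / 2.0) / sigma)

    exploration = pairs * delta               # one unit per pair gets the worse arm
    exploitation = (N - m) * delta * p_wrong  # rollout loss if the wrong arm is chosen
    return exploration + exploitation


if __name__ == "__main__":
    N, delta, sigma = 1000, 0.5, 1.0
    for m in (0, 20, 100, 200, 500, 1000):
        print(m, round(expected_regret(N, m, delta, sigma), 3))
```

In this sketch, running no experiment ($m = 0$) and experimenting on every unit ($m = N$) both give an expected regret of `delta * N / 2`, while intermediate sample sizes can do better. That is the tradeoff a sample-size criterion has to resolve across all possible values of the gap.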