Adaptive experiments use preliminary analyses of the data to inform the further course of action and are commonly used in many disciplines, including medical and social sciences. Because the null hypothesis and experimental design are not pre-specified, it has long been recognized that statistical inference for adaptive experiments is not straightforward. Most existing methods only apply to specific adaptive designs and rely on strong assumptions. In this work, we propose selective randomization inference as a general framework for analyzing adaptive experiments. In a nutshell, our approach applies conditional post-selection inference to randomization tests. By using directed acyclic graphs to describe the data-generating process, we derive a selective randomization p-value that controls the selective type-I error without requiring independent and identically distributed data or any other modelling assumptions. We show how rejection sampling and Markov chain Monte Carlo can be used to compute the selective randomization p-values and construct confidence intervals for a homogeneous treatment effect. To mitigate the risk of disconnected confidence intervals, we propose the use of hold-out units. Lastly, we demonstrate our method and compare it with other randomization tests using synthetic and real-world data.
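To illustrate the rejection-sampling route in its simplest form, the sketch below (a minimal Python illustration, not the construction developed in the paper) redraws treatment assignments from the randomization distribution under a sharp null, discards draws that fall outside the conditioning (selection) event, and reports the add-one-corrected proportion of accepted draws whose test statistic is at least as extreme as the observed one. The callables statistic, draw_assignment, and selection_event are hypothetical placeholders for the design- and DAG-specific components described in the paper.

```python
import numpy as np


def selective_randomization_pvalue(statistic, draw_assignment, selection_event,
                                   z_obs, y, n_draws=10_000, seed=0):
    """Rejection-sampling sketch of a selective randomization p-value.

    Assumed (hypothetical) ingredients:
      statistic(z, y)      -- test statistic; under a sharp null the outcomes y
                              are unchanged when the assignment z is resampled
      draw_assignment(rng) -- one draw from the experiment's randomization
                              distribution
      selection_event(z)   -- True iff the draw is consistent with the adaptive
                              decisions actually made (the conditioning event)
    """
    rng = np.random.default_rng(seed)
    t_obs = statistic(z_obs, y)
    accepted = []
    for _ in range(n_draws):
        z = draw_assignment(rng)
        if selection_event(z):          # condition on the selection event
            accepted.append(statistic(z, y))
    accepted = np.asarray(accepted)
    # Add-one correction keeps the p-value strictly positive and valid.
    return (1 + np.sum(accepted >= t_obs)) / (1 + len(accepted))
```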