Randomization testing is a fundamental method in statistics, enabling inferential tasks such as testing for (conditional) independence of random variables, constructing confidence intervals in semiparametric location models, and constructing model-free prediction intervals via conformal inference (by inverting a permutation test). Randomization tests are exactly valid for any sample size, but their use is generally confined to exchangeable data. Yet in many applications, data is routinely collected adaptively, e.g., via (contextual) bandit and reinforcement learning algorithms or adaptive experimental designs. In this paper, we present a general framework for randomization testing on adaptively collected data, despite its non-exchangeability, based on a novel weighted randomization test. We also present novel, computationally tractable resampling algorithms for this test, covering various popular adaptive assignment algorithms, data-generating environments, and types of inferential tasks. Finally, we demonstrate via a range of simulations the efficacy of our framework for both testing and confidence/prediction interval construction.
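For orientation, the classical setting that the abstract contrasts against can be sketched as a plain (unweighted) permutation test for independence on exchangeable data. This is only a minimal illustration of that baseline, not the weighted randomization test proposed in the paper. The function names, the absolute-covariance test statistic, and the default number of permutations are all assumptions made for the example.

```python
import random


def abs_covariance(x, y):
    """Test statistic (an illustrative choice): absolute sample covariance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n)


def permutation_p_value(x, y, num_permutations=999, seed=0):
    """Monte Carlo permutation p-value for independence of x and y.

    Assumes the pairs are exchangeable under the null hypothesis, so that
    every reordering of y is equally likely. This assumption generally
    fails for adaptively collected data.
    """
    rng = random.Random(seed)
    observed = abs_covariance(x, y)
    y_perm = list(y)  # copy so the caller's data is not modified
    at_least_as_extreme = 0
    for _ in range(num_permutations):
        rng.shuffle(y_perm)  # uniformly random reordering of y
        if abs_covariance(x, y_perm) >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed arrangement itself, which keeps the
    # Monte Carlo p-value valid at any sample size.
    return (1 + at_least_as_extreme) / (1 + num_permutations)


if __name__ == "__main__":
    xs = list(range(20))
    ys = [2 * v for v in xs]  # strongly dependent on xs
    print(permutation_p_value(xs, ys))
```

Every permutation receives equal weight here, which is exactly what exchangeability licenses. When the data are collected adaptively, equal weighting is no longer justified, which is the gap the weighted test described in the abstract is meant to address.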