We study the problem of designing consistent sequential two-sample tests in a nonparametric setting. Guided by the principle of testing by betting, we reframe this task as that of selecting a sequence of payoff functions that maximize the wealth of a fictitious bettor who bets against the null in a repeated game. In this setting, the relative increase in the bettor's wealth has a precise interpretation as a measure of evidence against the null, and thus our sequential test rejects the null when the wealth crosses an appropriate threshold. We develop a general framework for setting up the betting game for two-sample testing, in which a prediction strategy selects the payoffs as data-driven predictable estimates of the witness function associated with the variational representation of statistical distance measures such as integral probability metrics (IPMs). We then formally relate the statistical properties of the test (such as consistency, type-II error exponent, and expected sample size) to the regret of the corresponding prediction strategy. We construct a practical sequential two-sample test by instantiating our general strategy with the kernel-MMD metric, and demonstrate its ability to adapt to the difficulty of the unknown alternative through theoretical and empirical results. Our framework is versatile and extends easily to other problems; we illustrate this by applying our approach to construct consistent tests for the following problems: (i) time-varying two-sample testing with non-exchangeable observations, and (ii) an abstract class of "invariant" testing problems, including symmetry and independence testing.
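The betting game described above can be illustrated with a minimal sketch. It is not the construction analyzed in the paper: it assumes paired observations (X_t, Y_t), a Gaussian kernel with a hand-picked bandwidth, a plug-in witness estimate built only from past data, and a fixed betting fraction in place of a regret-minimizing prediction strategy. The names `gaussian_kernel`, `sequential_mmd_betting_test`, `bet`, and `bandwidth` are hypothetical. Because each payoff lies in [-1, 1] and has zero conditional mean when both streams share one distribution, the wealth is a nonnegative martingale under the null, so rejecting once it reaches 1/alpha keeps the type-I error at most alpha by Ville's inequality.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a and rows of b; values lie in [0, 1]."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def sequential_mmd_betting_test(xs, ys, alpha=0.05, bet=0.5, bandwidth=1.0):
    """Sketch of a sequential two-sample test by betting (assumed setup, see lead-in).

    xs, ys: arrays of shape (n, d), observed one pair per round.
    Returns (rejected, stopping_time, wealth_path).
    """
    wealth = 1.0
    wealth_path = []
    threshold = 1.0 / alpha  # Ville's inequality: P(sup wealth >= 1/alpha) <= alpha under the null
    for t in range(len(xs)):
        if t == 0:
            payoff = 0.0  # no past data yet, so place no bet in the first round
        else:
            past_x, past_y = xs[:t], ys[:t]
            x_new, y_new = xs[t:t + 1], ys[t:t + 1]

            def witness(z):
                # Plug-in estimate of the MMD witness function from past data only (predictable)
                return (gaussian_kernel(z, past_x, bandwidth).mean()
                        - gaussian_kernel(z, past_y, bandwidth).mean())

            # The witness takes values in [-1, 1]; the factor 0.5 keeps the payoff in [-1, 1]
            payoff = 0.5 * (witness(x_new) - witness(y_new))
        # Fixed betting fraction in (0, 1): the wealth stays strictly positive
        wealth *= 1.0 + bet * payoff
        wealth_path.append(wealth)
        if wealth >= threshold:
            return True, t + 1, wealth_path
    return False, len(xs), wealth_path


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.normal(0.0, 1.0, size=(500, 1))
    ys = rng.normal(1.0, 1.0, size=(500, 1))  # mean-shifted alternative
    rejected, stopping_time, _ = sequential_mmd_betting_test(xs, ys)
    print(rejected, stopping_time)
```

A fixed betting fraction keeps the sketch short but does not adapt to the difficulty of the alternative; an adaptive betting rule would replace the constant `bet` with a value updated from past payoffs.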