We study the problem of designing consistent sequential two-sample tests in a nonparametric setting. Guided by the principle of testing by betting, we reframe this task as that of selecting a sequence of payoff functions that maximize the wealth of a fictitious bettor, betting against the null in a repeated game. In this setting, the relative increase in the bettor's wealth has a precise interpretation as the measure of evidence against the null, and thus our sequential test rejects the null when the wealth crosses an appropriate threshold. We develop a general framework for setting up the betting game for two-sample testing, in which the payoffs are selected by a prediction strategy as data-driven predictable estimates of the witness function associated with the variational representation of a statistical distance measure, such as an integral probability metric (IPM). We then formally relate the statistical properties of the test (such as consistency, type-II error exponent, and expected sample size) to the regret of the corresponding prediction strategy. We construct a practical sequential two-sample test by instantiating our general strategy with the kernel-MMD metric, and demonstrate its ability to adapt to the difficulty of the unknown alternative through theoretical and empirical results. Our framework is versatile and extends easily to other problems; we illustrate this by applying our approach to construct consistent tests for the following problems: (i) time-varying two-sample testing with non-exchangeable observations, and (ii) an abstract class of "invariant" testing problems, including symmetry and independence testing.
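The betting game described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the construction analyzed in the paper: observations are assumed to arrive in pairs (X_t, Y_t), the witness function is a plug-in empirical kernel-MMD witness built from past pairs only, the Gaussian kernel and its bandwidth are arbitrary choices, and the bet size is a fixed fraction rather than one chosen by a regret-minimizing prediction strategy. The names `gaussian_kernel`, `sequential_mmd_betting_test`, and `bet_fraction` are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel values k(a_i, b) for rows a_i of `a` (n, d) and a point `b` (d,)."""
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def sequential_mmd_betting_test(xs, ys, alpha=0.05, bet_fraction=0.5, bandwidth=1.0):
    """Sequential two-sample test by betting (illustrative sketch).

    Returns (stopping_time, wealth_path); stopping_time is None if the
    wealth never reaches 1/alpha within the available data.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    xs = xs.reshape(len(xs), -1)
    ys = ys.reshape(len(ys), -1)

    wealth = 1.0  # the bettor starts with unit wealth
    path = []
    for t in range(min(len(xs), len(ys))):
        if t == 0:
            # No past data yet, so the bettor abstains in the first round.
            payoff = 0.0
        else:
            past_x, past_y = xs[:t], ys[:t]

            def witness(z):
                # Plug-in MMD witness from past pairs only (predictable).
                # The Gaussian kernel lies in (0, 1], so this lies in [-1, 1].
                return (gaussian_kernel(past_x, z, bandwidth).mean()
                        - gaussian_kernel(past_y, z, bandwidth).mean())

            # Scale by 1/2 so the payoff lies in [-1, 1].
            payoff = 0.5 * (witness(xs[t]) - witness(ys[t]))

        # With bet_fraction in [0, 1), the wealth stays strictly positive.
        wealth *= 1.0 + bet_fraction * payoff
        path.append(wealth)

        # Reject the null once the wealth reaches the threshold 1/alpha.
        if wealth >= 1.0 / alpha:
            return t + 1, path
    return None, path


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(500, 1))
    y = rng.normal(2.0, 1.0, size=(500, 1))
    stopping_time, wealth_path = sequential_mmd_betting_test(x, y)
    print("stopping time:", stopping_time)
```

When the pairs are i.i.d. and both samples come from the same distribution, each payoff has conditional mean zero given the past, so the wealth is a nonnegative martingale starting at one, and Ville's inequality bounds the probability of ever reaching 1/alpha by alpha. A fixed bet fraction keeps the sketch short; it does not adapt to the difficulty of the alternative the way an adaptive betting strategy would.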