In two-sample testing, one observes two independent sequences of independent and identically distributed random variables, drawn from the distributions $P_1$ and $P_2$, respectively, and wishes to decide whether $P_1=P_2$ (null hypothesis) or $P_1\neq P_2$ (alternative hypothesis). The Gutman test for this problem compares the empirical distributions of the observed sequences and accepts the null hypothesis if the Jensen-Shannon (JS) divergence between these empirical distributions is below a given threshold. This paper proposes a generalization of the Gutman test, termed the \emph{divergence test}, which replaces the JS divergence by an arbitrary divergence. For this test, the exponential decay of the type-II error probability for a fixed type-I error probability is studied. First, it is shown that the divergence test achieves the optimal first-order exponent, irrespective of the choice of divergence. Second, it is demonstrated that the divergence test with an invariant divergence achieves the same second-order asymptotics as the Gutman test. In addition, it is shown that the Gutman test is the generalized likelihood-ratio test (GLRT) for the two-sample testing problem, and a connection between two-sample testing and robust goodness-of-fit testing is established.
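The decision rule described above (compare the two empirical distributions and accept the null hypothesis when their divergence falls below a threshold) can be sketched in a few lines. This is only an illustrative sketch under assumptions that are not taken from the paper: a finite alphabet, the equal-weight JS divergence in nats, and a user-supplied threshold. The function names (`empirical_distribution`, `divergence_test`) and the squared Hellinger distance used as an alternative divergence are hypothetical choices for illustration; the paper's exact statistic, weighting for unequal sequence lengths, and conditions on the divergence may differ.

```python
import math
from collections import Counter


def empirical_distribution(samples, alphabet):
    """Relative frequency of each alphabet symbol in the sample sequence."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[a] / n for a in alphabet]


def kl_divergence(p, q):
    """D(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Equal-weight Jensen-Shannon divergence (an assumption of this sketch)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def squared_hellinger(p, q):
    """Squared Hellinger distance, used here only as an example alternative."""
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))


def divergence_test(x, y, alphabet, threshold, divergence=js_divergence):
    """Return True to accept the null hypothesis P_1 = P_2, False to reject it.

    The threshold is a free parameter here; how to choose it for a target
    type-I error probability is not addressed by this sketch.
    """
    p_hat = empirical_distribution(x, alphabet)
    q_hat = empirical_distribution(y, alphabet)
    return divergence(p_hat, q_hat) < threshold
```

With the default `divergence=js_divergence`, the function plays the role of the JS-based rule; passing another function such as `squared_hellinger` gives the generalized form, for example `divergence_test(x, y, alphabet=[0, 1], threshold=0.05, divergence=squared_hellinger)`.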