We study a variant of the simple hypothesis testing problem in which the observed samples need not come from either of the specified distributions, but may instead come from a distribution close to one of them. In this setting, we require a test that is robust to misspecification and identifies which of the two specified distributions is closer to the underlying distribution in Hellinger distance. If the underlying distribution is nearly equidistant from both hypotheses, the problem becomes intractable. Our main result is a lower bound on the slack factor, which quantifies how much closer the underlying distribution must be to one hypothesis than to the other for any test to remain robust. We also demonstrate the implications of this result for testing with respect to symmetric chi-squared distance. Finally, we study an alternative way to specify robustness, where each hypothesis is a Hellinger ball around a fixed distribution. We provide and analyze a test for this composite hypothesis testing problem.
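To make the notion of "closer in Hellinger distance" concrete, the following is a minimal sketch for discrete distributions on a shared finite support. It is illustrative only and is not the test analyzed in this work: it assumes the underlying distribution `r` is known exactly rather than observed through samples. The function names, the normalization of the Hellinger distance (with the factor 1/2, so that values lie in [0, 1]), and the way the `slack` parameter enters the comparison are all assumptions made for this example.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Assumed convention: H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2),
    which lies in [0, 1]. Some authors omit the factor 0.5.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * total)


def closer_hypothesis(r, p, q, slack=1.0):
    """Report which hypothesis r is closer to, up to a hypothetical slack factor.

    Returns "p" if slack * H(r, p) < H(r, q), "q" if slack * H(r, q) < H(r, p),
    and None when neither holds (r is too close to equidistant to decide).
    The multiplicative form of the slack is an assumption for illustration.
    """
    d_p = hellinger(r, p)
    d_q = hellinger(r, q)
    if slack * d_p < d_q:
        return "p"
    if slack * d_q < d_p:
        return "q"
    return None
```

For instance, `closer_hypothesis([0.5, 0.5], [1.0, 0.0], [0.0, 1.0])` returns `None`, since the uniform distribution is equidistant from the two point masses; this is the regime the abstract describes as intractable. A sample-based test would have to make the same decision from data alone, which is where the slack factor becomes essential.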