Two-sample testing, where we aim to determine whether two distributions are equal based on samples from each, is challenging when we cannot place assumptions on the properties of the two distributions. In particular, certifying equality of distributions, or even providing a tight upper bound on the total variation (TV) distance between them, is impossible in a distribution-free regime. In this work, we examine the blurred TV distance, a relaxation of the TV distance that enables inference without assumptions on the distributions. We provide theoretical guarantees for distribution-free upper and lower bounds on the blurred TV distance, and examine its properties in high dimensions.