Two-sample testing, where we aim to determine whether two distributions are equal based on samples from each, is challenging when we cannot place assumptions on the two distributions. In particular, certifying equality of distributions, or even providing a tight upper bound on the total variation (TV) distance between them, is impossible in a distribution-free regime. In this work, we examine the blurred TV distance, a relaxation of the TV distance that enables inference without assumptions on the distributions. We provide theoretical guarantees for distribution-free upper and lower bounds on the blurred TV distance, and examine its properties in high dimensions.