Samplers are the backbone of any implementation of a randomised algorithm. Unfortunately, efficient algorithms for testing the correctness of samplers are very hard to obtain. Recently, a series of works introduced testers such as $\mathsf{Barbarik}$, $\mathsf{Teq}$, and $\mathsf{Flash}$ for particular kinds of samplers, such as CNF-samplers and Horn-samplers. Their techniques have a significant limitation, however: one cannot expect to use them to test other samplers, such as perfect matching samplers or samplers for linear extensions of posets. In this paper, we present a new testing algorithm that works for such samplers and can estimate the distance of a new sampler from a known sampler (say, the uniform sampler). Testing the identity of distributions is at the heart of testing the correctness of samplers. This paper's main technical contribution is a new distance estimation algorithm for distributions over high-dimensional cubes in the recently proposed subcube conditioning sampling model. Given subcube conditioning access to an unknown distribution $P$ and a known distribution $Q$, both defined over $\{0,1\}^n$, our algorithm $\mathsf{CubeProbeEst}$ estimates the variation distance between $P$ and $Q$ within additive error $\zeta$ using $O\left({n^2}/{\zeta^4}\right)$ subcube conditional samples from $P$. Following the testing-via-learning paradigm, we also obtain a tester that distinguishes, with probability at least $0.99$, between the cases where $P$ and $Q$ are $\varepsilon$-close or $\eta$-far in variation distance, using $O({n^2}/{(\eta-\varepsilon)^4})$ subcube conditional samples. This estimation algorithm in the subcube conditioning sampling model lets us design the first tester for self-reducible samplers.
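To make the access model and the estimation-to-testing step concrete, here is a minimal brute-force sketch in Python. It is not $\mathsf{CubeProbeEst}$: the helper names (`subcube_sample`, `tv_distance`, `tolerant_test`) and the toy distributions are assumptions introduced purely for illustration, and the code enumerates all of $\{0,1\}^n$, so it is only usable for tiny $n$. It shows what a subcube conditional sample is (some coordinates are fixed, the rest are drawn from the conditional distribution) and how any estimate with additive error below $(\eta-\varepsilon)/2$ yields a close-versus-far decision by thresholding at the midpoint $(\varepsilon+\eta)/2$.

```python
import itertools
import random


def subcube_sample(prob, n, fixed, rng):
    """Draw x from the distribution `prob` over {0,1}^n, conditioned on
    x[i] == b for every (i, b) in the dict `fixed` (a subcube).
    Brute force over the whole cube: an illustration, not an efficient oracle."""
    support = [x for x in itertools.product((0, 1), repeat=n)
               if all(x[i] == b for i, b in fixed.items())]
    weights = [prob(x) for x in support]
    if sum(weights) == 0:
        raise ValueError("conditioning on a zero-probability subcube")
    return rng.choices(support, weights=weights, k=1)[0]


def tv_distance(p, q, n):
    """Exact variation distance, 0.5 * sum_x |p(x) - q(x)|, by enumeration."""
    cube = itertools.product((0, 1), repeat=n)
    return 0.5 * sum(abs(p(x) - q(x)) for x in cube)


def tolerant_test(estimate, eps, eta):
    """Return True ("eps-close") iff the estimate is at most (eps + eta) / 2.
    The decision is correct whenever the estimate is within an additive
    error smaller than (eta - eps) / 2 of the true distance."""
    if not 0 <= eps < eta <= 1:
        raise ValueError("need 0 <= eps < eta <= 1")
    return estimate <= (eps + eta) / 2


# Hypothetical toy distributions on {0,1}^3 (assumptions for this sketch).
N = 3

def uniform(x):
    return 1 / 2 ** N

def biased(x):
    # First bit is 1 with probability 3/4; the remaining bits are uniform.
    return (0.75 if x[0] == 1 else 0.25) / 2 ** (N - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    # A conditional sample with the first coordinate fixed to 1.
    print(subcube_sample(biased, N, {0: 1}, rng))
    # Exact distance used here as a stand-in for an estimator's output.
    d = tv_distance(biased, uniform, N)
    print(d, tolerant_test(d, eps=0.05, eta=0.2))
```

In this sketch the exact distance stands in for the estimator's output; in the setting of the abstract, that value would instead come from an estimator that only sees subcube conditional samples from $P$.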