Distribution-Free Proofs of Proximity

Motivated by the fact that input distributions are often unknown in advance, distribution-free property testing considers a setting in which the algorithmic task is to accept functions $f : [n] \to \{0,1\}$ having a certain property $\Pi$ and reject functions that are $\epsilon$-far from $\Pi$, where the distance is measured according to an arbitrary and unknown input distribution $D$ over $[n]$. As usual in property testing, the tester must do so while making only a sublinear number of input queries, but since the distribution is unknown, we also allow it a sublinear number of samples from $D$. In this work we initiate the study of distribution-free interactive proofs of proximity (df-IPPs), in which the distribution-free testing algorithm is assisted by an all-powerful but untrusted prover. Our main result is a df-IPP for any problem $\Pi \in NC$, with $\tilde{O}(\sqrt{n})$ communication, sample, query, and verification complexities, for any proximity parameter $\epsilon>1/\sqrt{n}$. For such proximity parameters, this result matches the parameters of the best-known general-purpose IPPs in the standard uniform setting, and is optimal under reasonable cryptographic assumptions. For general values of the proximity parameter $\epsilon$, our df-IPP has optimal query complexity $O(1/\epsilon)$, but its communication complexity is $\tilde{O}(\epsilon \cdot n + 1/\epsilon)$, which is worse than what is known for uniform IPPs when $\epsilon<1/\sqrt{n}$. With the aim of closing this gap, we further show that for IPPs over specialised but large distribution families, such as sufficiently smooth distributions and product distributions, the communication complexity can be reduced to $\epsilon\cdot n\cdot(1/\epsilon)^{o(1)}$ (keeping the query complexity roughly the same as before), matching the communication complexity of the uniform case.
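
For concreteness, the distance notion and the trade-off between the two communication terms can be written out as follows. This is a sketch under the standard property-testing conventions, which the abstract does not spell out: the symbols $\mathrm{dist}_D$ and $g$ are introduced here for illustration only, and whether "$\epsilon$-far" uses a strict or non-strict inequality is an assumption.

$$
\mathrm{dist}_D(f, g) = \Pr_{x \sim D}\big[f(x) \neq g(x)\big],
\qquad
\mathrm{dist}_D(f, \Pi) = \min_{g \in \Pi} \mathrm{dist}_D(f, g),
$$

so that $f$ is $\epsilon$-far from $\Pi$ with respect to $D$ when $\mathrm{dist}_D(f, \Pi) > \epsilon$. Ignoring polylogarithmic factors, the two terms of the general communication bound balance exactly at $\epsilon = 1/\sqrt{n}$:

$$
\epsilon \cdot n \,\Big|_{\epsilon = 1/\sqrt{n}} = \sqrt{n} = \frac{1}{\epsilon}\,\Big|_{\epsilon = 1/\sqrt{n}},
\qquad
\epsilon < \frac{1}{\sqrt{n}} \;\Longrightarrow\; \epsilon \cdot n < \sqrt{n} < \frac{1}{\epsilon}.
$$

As an illustrative instance, for $n = 10^{6}$ and $\epsilon = 10^{-4}$ the term $\epsilon \cdot n$ equals $10^{2}$ while $1/\epsilon$ equals $10^{4}$, so the $1/\epsilon$ term dominates the bound $\tilde{O}(\epsilon \cdot n + 1/\epsilon)$. This is the regime in which the bound exceeds the $\epsilon\cdot n\cdot(1/\epsilon)^{o(1)}$ communication of the uniform case.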
