We consider the basic statistical problem of detecting truncation of the uniform distribution on the Boolean hypercube by juntas. More concretely, we give upper and lower bounds for the problem of distinguishing, given i.i.d. samples, between (a) the uniform distribution over $\{0,1\}^n$ and (b) the uniform distribution over $\{0,1\}^n$ conditioned on the satisfying assignments of a $k$-junta $f: \{0,1\}^n\to\{0,1\}$. We show that (up to constant factors) $\min\{2^k + \log{n\choose k}, {2^{k/2}\log^{1/2}{n\choose k}}\}$ samples suffice for this task, and that a $\log{n\choose k}$ dependence in the sample complexity is unavoidable. Our results suggest that testing junta truncation requires learning the set of relevant variables of the junta.
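To make the two sampling models concrete, the following is a minimal sketch of how samples from cases (a) and (b) could be generated. Everything named here is an illustrative assumption rather than part of the paper: the function names `sample_uniform` and `sample_truncated`, the representation of the junta as a set of relevant coordinates plus a function `g` on those $k$ bits, and the use of rejection sampling.

```python
import itertools
import random
from typing import Callable, Sequence, Tuple

Bits = Tuple[int, ...]


def sample_uniform(n: int, rng: random.Random) -> Bits:
    """Case (a): one draw from the uniform distribution over {0,1}^n."""
    return tuple(rng.randint(0, 1) for _ in range(n))


def sample_truncated(
    n: int,
    relevant: Sequence[int],
    g: Callable[[Bits], int],
    rng: random.Random,
) -> Bits:
    """Case (b): one draw from the uniform distribution conditioned on f(x) = 1.

    Assumption: the k-junta f is given as f(x) = g(x restricted to `relevant`),
    where `relevant` lists the k relevant coordinates (hypothetical encoding).
    """
    k = len(relevant)
    # Rejection sampling never terminates if f has no satisfying assignment,
    # so check satisfiability by brute force over the 2^k relevant settings.
    if not any(g(bits) == 1 for bits in itertools.product((0, 1), repeat=k)):
        raise ValueError("the junta has no satisfying assignment")
    while True:
        x = sample_uniform(n, rng)
        if g(tuple(x[i] for i in relevant)) == 1:
            return x


# Usage: a 2-junta on n = 8 variables that is the AND of coordinates 1 and 4.
rng = random.Random(0)
relevant = (1, 4)
and_junta = lambda bits: int(all(bits))
uniform_samples = [sample_uniform(8, rng) for _ in range(50)]
truncated_samples = [sample_truncated(8, relevant, and_junta, rng) for _ in range(50)]
```

A distinguisher sees only one of the two sample lists and does not know `relevant` or `g`. That unknown set of relevant coordinates is the source of the $\log{n\choose k}$ term in the bounds.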