We consider the basic statistical problem of detecting truncation of the uniform distribution on the Boolean hypercube by juntas. More concretely, we give upper and lower bounds for the problem of distinguishing, given i.i.d. sample access, between (a) the uniform distribution over $\{0,1\}^n$ and (b) the uniform distribution over $\{0,1\}^n$ conditioned on the satisfying assignments of a $k$-junta $f: \{0,1\}^n\to\{0,1\}$. We show that, up to constant factors, $\min\{2^k + \log{n\choose k}, {2^{k/2}\log^{1/2}{n\choose k}}\}$ samples suffice for this task. We also show that a $\log{n\choose k}$ term in the sample complexity is unavoidable. Our results suggest that testing junta truncation requires learning the set of relevant variables of the junta.
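To make the setup concrete, the following is a minimal sketch of the two sampling oracles and of the two terms inside the stated upper bound. It is illustrative only and not the paper's tester: the function names, the use of rejection sampling to draw from the truncated distribution, and the choice of natural logarithm are all assumptions made here, and constant factors are dropped.

```python
import math
import random


def sample_uniform(n, rng):
    """Draw one point uniformly from {0,1}^n."""
    return [rng.randint(0, 1) for _ in range(n)]


def sample_truncated(n, relevant, f, rng):
    """Draw one point uniformly from {0,1}^n conditioned on f = 1.

    `relevant` lists the (at most k) coordinates the junta depends on, and
    `f` maps the bits on those coordinates to a truthy/falsy value.
    Rejection sampling is an assumption of this sketch; it does not
    terminate if f has no satisfying assignment.
    """
    while True:
        x = sample_uniform(n, rng)
        if f(tuple(x[i] for i in relevant)):
            return x


def bound_terms(n, k):
    """Return the two terms inside the min of the upper bound.

    Constants are dropped and the natural log is used (an assumption,
    since the bound is stated only up to constant factors).
    """
    log_binom = math.log(math.comb(n, k))
    return 2**k + log_binom, 2 ** (k / 2) * math.sqrt(log_binom)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-junta: AND of coordinates 1 and 4 in dimension 8.
    x = sample_truncated(8, (1, 4), all, rng)
    print(x, bound_terms(100, 2))
```

A distinguisher would receive samples from exactly one of `sample_uniform` or `sample_truncated`, without knowing `relevant` or `f`, and must decide which oracle produced them.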