Likelihood-free hypothesis testing

Consider the problem of testing $Z \sim \mathbb P^{\otimes m}$ vs $Z \sim \mathbb Q^{\otimes m}$ from $m$ samples. Generally, to achieve a small error rate it is necessary and sufficient to have $m \asymp 1/\epsilon^2$, where $\epsilon$ measures the separation between $\mathbb P$ and $\mathbb Q$ in total variation ($\mathsf{TV}$). Achieving this, however, requires complete knowledge of the distributions $\mathbb P$ and $\mathbb Q$ and can be done, for example, using the Neyman-Pearson test. In this paper we consider a variation of the problem, which we call likelihood-free (or simulation-based) hypothesis testing, where access to $\mathbb P$ and $\mathbb Q$ (which are a priori only known to belong to a large non-parametric family $\mathcal P$) is given through $n$ i.i.d. samples from each. We demonstrate the existence of a fundamental trade-off between $n$ and $m$ given by $nm \asymp n^2_\mathsf{GoF}(\epsilon,\mathcal P)$, where $n_\mathsf{GoF}$ is the minimax sample complexity of testing between the hypotheses $H_0: \mathbb P= \mathbb Q$ vs $H_1: \mathsf{TV}(\mathbb P,\mathbb Q) \ge \epsilon$. We show this for three non-parametric families $\mathcal P$: $\beta$-smooth densities over $[0,1]^d$, the Gaussian sequence model over a Sobolev ellipsoid, and the collection of distributions $\mathcal P$ on a large alphabet $[k]$ with pmfs bounded by $c/k$ for fixed $c$. The test that we propose (based on the $L^2$-distance statistic of Ingster) simultaneously achieves all points on the trade-off curve for these families. In particular, when $m\gg 1/\epsilon^2$ our test requires the number of simulation samples $n$ to be orders of magnitude smaller than what is needed for density estimation with accuracy $\asymp \epsilon$ (under $\mathsf{TV}$). This demonstrates the possibility of testing without fully estimating the distributions.
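To make the setting concrete, the following Python sketch illustrates likelihood-free testing on a finite alphabet $[k]$: it compares the squared $L^2$ distance between the empirical pmf of $Z$ and the empirical pmfs of the two simulated samples. This is a simplified illustration under our own assumptions, not the exact statistic analyzed in the paper; the function names (`empirical_pmf`, `squared_l2`, `lf_test`, `draw`), the tie-breaking rule, and the example distributions are all hypothetical choices made for the demonstration.

```python
import random
from collections import Counter


def empirical_pmf(samples, k):
    """Empirical frequencies of `samples` over the alphabet {0, ..., k-1}."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(i, 0) / n for i in range(k)]


def squared_l2(p, q):
    """Squared Euclidean distance between two pmf vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lf_test(x, y, z, k):
    """Decide whether z was drawn from the law of x ('P') or of y ('Q').

    Assumed simplification: plug-in empirical pmfs and a comparison of
    squared L2 distances, with ties broken in favour of 'P'.
    """
    px, py, pz = (empirical_pmf(s, k) for s in (x, y, z))
    return "P" if squared_l2(pz, px) <= squared_l2(pz, py) else "Q"


def draw(pmf, size, rng):
    """Draw `size` i.i.d. symbols from `pmf` using the generator `rng`."""
    return rng.choices(range(len(pmf)), weights=pmf, k=size)


if __name__ == "__main__":
    rng = random.Random(0)
    k, eps = 20, 0.5
    # Hypothetical pair: P is uniform, Q tilts the two halves of the alphabet.
    # Both pmfs are bounded by 2/k, and TV(P, Q) = eps / 2.
    p = [1.0 / k] * k
    q = [(1 + eps) / k] * (k // 2) + [(1 - eps) / k] * (k // 2)

    n, m, trials = 200, 200, 500
    errors = 0
    for _ in range(trials):
        x, y = draw(p, n, rng), draw(q, n, rng)
        truth = rng.choice(["P", "Q"])
        z = draw(p if truth == "P" else q, m, rng)
        errors += lf_test(x, y, z, k) != truth
    print(f"empirical error rate with n={n}, m={m}: {errors / trials:.3f}")
```

Varying `n` and `m` in the demo while holding their product roughly fixed gives an informal feel for the trade-off described above, although a small simulation of this kind cannot confirm the minimax rates.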
