Estimating the second frequency moment of a stream to within a $(1\pm\varepsilon)$ multiplicative error can be done in $O(\log n / \varepsilon^2)$ bits of space, by a seminal result of Alon, Matias, and Szegedy (AMS). It is also known that $\Omega(\log n + 1/\varepsilon^{2})$ bits of space are needed. We prove an optimal lower bound of $\Omega\left(\log \left(n \varepsilon^2 \right) / \varepsilon^2\right)$ for all $\varepsilon = \Omega(1/\sqrt{n})$. Note that when $\varepsilon>n^{-1/2 + c}$ for a constant $c>0$, we have $\log(n\varepsilon^2) > 2c\log n$, so our lower bound matches the classic AMS upper bound. For smaller values of $\varepsilon$, we also introduce a revised algorithm that improves on the classic AMS bound and matches our lower bound. Our lower bound also holds for the more general problem of estimating the $p$-th frequency moment for $p\in (1,2]$, giving a tight bound in the only range that remained to be settled for the optimal space complexity of estimating frequency moments.
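For readers unfamiliar with the classic upper bound, the following is a minimal sketch of an AMS-style estimator for $F_2 = \sum_i f_i^2$, not the revised algorithm introduced in this work. The class name `AMSSketch`, its parameters, and the polynomial sign hash are illustrative assumptions. The sketch keeps signed counters $Z_j = \sum_i f_i\, s_j(i)$ with random signs $s_j(i)\in\{-1,+1\}$ and averages the $Z_j^2$. Using on the order of $1/\varepsilon^2$ counters of $O(\log n)$ bits each is what gives the $O(\log n/\varepsilon^2)$ space bound. This toy version makes no attempt at bit-level space accounting and omits the usual median-of-means step.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the hash field size (assumption)


class AMSSketch:
    """Toy AMS-style estimator for F2 = sum_i f_i^2 (illustrative only)."""

    def __init__(self, num_counters, seed=0):
        rng = random.Random(seed)
        # One random degree-3 polynomial per counter yields (approximately)
        # 4-wise independent signs, as the variance analysis requires.
        self.coeffs = [
            [rng.randrange(PRIME) for _ in range(4)]
            for _ in range(num_counters)
        ]
        self.counters = [0] * num_counters

    @staticmethod
    def _sign(coeffs, item):
        # Evaluate the polynomial at the item and map the low bit to -1/+1.
        a, b, c, d = coeffs
        x = item % PRIME
        h = (((a * x + b) * x + c) * x + d) % PRIME
        return 1 if h & 1 else -1

    def update(self, item, delta=1):
        # Process one stream element: add its signed weight to every counter.
        for j, coeffs in enumerate(self.coeffs):
            self.counters[j] += delta * self._sign(coeffs, item)

    def estimate(self):
        # Each Z_j^2 has expectation F2; averaging reduces the variance.
        return sum(z * z for z in self.counters) / len(self.counters)


if __name__ == "__main__":
    stream = [i % 20 for i in range(200)]  # 20 items, each with frequency 10
    sketch = AMSSketch(num_counters=400, seed=1)
    for item in stream:
        sketch.update(item)
    print("exact F2:", 20 * 10 ** 2, "estimate:", sketch.estimate())
```

Averaging $k$ counters gives a standard deviation of at most $\sqrt{2/k}\,F_2$, so a relative error of $\varepsilon$ with constant probability needs $k$ on the order of $1/\varepsilon^2$.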