Multi-dimensional Approximate Counting

The celebrated Morris counter uses $\log_2\log_2 n + O(\log_2 \sigma^{-1})$ bits to count up to $n$ with relative error $\sigma$: if $\hat{\lambda}$ is the estimate of the current count $\lambda$, then $\mathbb{E}|\hat{\lambda}-\lambda|^2 <\sigma^2\lambda^2$. A natural generalization is \emph{multi-dimensional} approximate counting. Let $d\geq 1$ be the dimension. The count vector $x\in \mathbb{N}^d$ is incremented entry-wise over a stream of coordinates $(w_1,\ldots,w_n)\in [d]^n$: upon receiving $w_k\in[d]$, the update is $x_{w_k}\gets x_{w_k}+1$. A \emph{$d$-dimensional approximate counter} must track all $d$ coordinates simultaneously and return an estimate $\hat{x}$ of the count vector $x$. Aden-Ali, Han, Nelson, and Yu \cite{aden2022amortized} showed that the trivial solution of using $d$ Morris counters, one per coordinate, is already optimal in space \emph{if each entry only allows error relative to itself}, i.e., $\mathbb{E}|\hat{x}_j-x_j|^2<\sigma^2|x_j|^2$ for each $j\in [d]$. However, for another natural error metric, the \emph{Euclidean mean squared error} $\mathbb{E} |\hat{x}-x|^2$, we show that using $d$ separate Morris counters is sub-optimal. In this work, we present a simple and optimal $d$-dimensional counter with Euclidean relative error $\sigma$, i.e., $\mathbb{E} |\hat{x}-x|^2 <\sigma^2|x|^2$ where $|x|=\sqrt{\sum_{j=1}^d x_j^2}$, together with a matching lower bound. Both bounds are proved with strikingly simple ideas: the upper bound comes from a certain variable-length integer encoding, and the lower bound from a straightforward volumetric estimate for sphere covering.
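For orientation, the baseline discussed above (one Morris counter per coordinate) can be sketched as follows. This is a minimal illustration, not the construction of this work: it assumes the standard base-$(1+a)$ variant of the Morris counter with $a = 2\sigma^2$, it indexes coordinates from $0$ rather than $1$, and the names `MorrisCounter` and `count_stream` are hypothetical.

```python
import random


class MorrisCounter:
    """Base-(1+a) Morris counter: only the small exponent c is stored."""

    def __init__(self, a, rng):
        self.a = a      # accuracy parameter (assumed here: a = 2 * sigma^2)
        self.c = 0      # stored exponent, roughly log_{1+a}(1 + a * count)
        self.rng = rng  # source of randomness, e.g. random.Random(seed)

    def increment(self):
        # Advance the exponent with probability (1 + a)^(-c).
        if self.rng.random() < (1.0 + self.a) ** (-self.c):
            self.c += 1

    def estimate(self):
        # Unbiased estimate: E[(1 + a)^c] = 1 + a * n after n increments.
        return ((1.0 + self.a) ** self.c - 1.0) / self.a


def count_stream(stream, d, sigma, rng):
    """Baseline d-dimensional counter: one independent Morris counter per coordinate."""
    counters = [MorrisCounter(2.0 * sigma ** 2, rng) for _ in range(d)]
    for w in stream:  # each w is a coordinate in {0, ..., d-1}
        counters[w].increment()
    return [counter.estimate() for counter in counters]
```

After $n$ increments this estimator has variance $a\,n(n-1)/2$, so the choice $a = 2\sigma^2$ gives $\mathbb{E}|\hat{x}_j-x_j|^2 = \sigma^2 x_j(x_j-1) < \sigma^2 x_j^2$ for every coordinate with $x_j \geq 1$. Summing over $j$ shows that this baseline also satisfies the Euclidean bound $\mathbb{E}|\hat{x}-x|^2 < \sigma^2|x|^2$; the point made above is that it spends more space than necessary to do so.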
