In this paper, we study the problem of mean estimation under strict 1-bit communication constraints. We propose a novel adaptive mean estimator based solely on randomized threshold queries, where each 1-bit outcome indicates whether a given sample exceeds a sequentially chosen threshold. Our estimator is $(ε, δ)$-PAC for any distribution with a bounded mean $μ\in [-λ, λ]$ and a bounded $k$-th central moment $\mathbb{E}[|X-μ|^k] \le σ^k$ for any fixed $k > 1$. Crucially, our sample complexity is order-optimal in all such tail regimes, i.e., for every such value of $k$. For $k \neq 2$, our estimator's sample complexity matches the unquantized minimax lower bounds plus an unavoidable $O(\log(λ/σ))$ localization cost. For the finite-variance case ($k=2$), our estimator's sample complexity incurs an extra multiplicative $O(\log(σ/ε))$ penalty, and we establish a novel information-theoretic lower bound showing that this penalty is a fundamental limit of 1-bit quantization. We also establish a significant adaptivity gap: for both threshold queries and more general interval queries, the sample complexity of any non-adaptive estimator must scale linearly with the search space parameter $λ/σ$, rendering it far less sample-efficient than our adaptive approach. Finally, we present algorithmic variants that (i) handle an unknown sampling budget, (ii) adapt to an unknown scale parameter $σ$ given (possibly loose) bounds, and (iii) require only two stages of adaptivity at the expense of more complicated general 1-bit queries.
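To make the notion of a randomized threshold query concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes, for illustration only, that the samples lie in a known interval $[\text{lo}, \text{hi}]$ and that thresholds are drawn non-adaptively and uniformly from that interval, in which case the fraction of 1-bits is an unbiased proxy for the mean. The names `threshold_bit` and `one_bit_mean_estimate` are hypothetical.

```python
import random


def threshold_bit(sample, threshold):
    """One 1-bit query: report only whether the sample exceeds the threshold."""
    return 1 if sample > threshold else 0


def one_bit_mean_estimate(draw_sample, lo, hi, n, rng):
    """Estimate E[X] from n one-bit threshold queries.

    Assumption (not from the paper): X is supported on [lo, hi].
    With T uniform on [lo, hi], P(X > T | X) = (X - lo) / (hi - lo),
    so E[bit] = (E[X] - lo) / (hi - lo), which we invert below.
    """
    ones = 0
    for _ in range(n):
        t = rng.uniform(lo, hi)          # fresh random threshold per sample
        ones += threshold_bit(draw_sample(), t)
    return lo + (hi - lo) * ones / n


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative distribution: uniform on [0.1, 0.5], so the true mean is 0.3.
    estimate = one_bit_mean_estimate(
        lambda: rng.uniform(0.1, 0.5), lo=-1.0, hi=1.0, n=100_000, rng=rng
    )
    print(f"estimated mean: {estimate:.3f}")
```

The standard deviation of this naive estimate is at most $(\text{hi}-\text{lo})/(2\sqrt{n})$, so its error grows with the width of the search interval; this is the kind of cost that sequentially chosen thresholds are meant to reduce.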