This paper addresses the statistical problem of estimating the infinity-norm deviation of the empirical mean from the distribution mean for high-dimensional distributions on $\{0,1\}^d$, potentially with $d=\infty$. Departing from traditional bounds such as those in the classical Glivenko-Cantelli theorem, we study instance-dependent convergence behavior. For product distributions, we characterize the exact non-asymptotic behavior of the expected maximum deviation, revealing several regimes of decay. In particular, these tight bounds recover the known asymptotic sub-Gaussian behavior and show that a factor previously proposed for an upper bound is necessary, answering a corresponding COLT 2023 open problem.
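To make the quantity concrete: for a product distribution with mean vector $p$ and empirical mean $\hat{p}_n$ computed from $n$ i.i.d. samples, the expected maximum deviation is $\mathbb{E}\|\hat{p}_n - p\|_\infty$. The following is a minimal Monte Carlo sketch of that quantity, assuming a finite dimension $d$ and independent Bernoulli coordinates. The function name `expected_max_deviation` and its parameters are hypothetical and chosen for illustration only; the sketch is not the paper's method and does not reproduce its bounds.

```python
import random


def expected_max_deviation(p, n, trials=200, seed=0):
    """Monte Carlo estimate of E[max_j |p_hat_j - p_j|].

    Assumptions (illustrative, not from the paper):
      - p is a finite list of Bernoulli means, one per coordinate;
      - coordinates are independent (a product distribution on {0,1}^d);
      - n is the number of i.i.d. samples used for the empirical mean.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        worst = 0.0
        for pj in p:
            # Empirical mean of n independent Bernoulli(pj) draws for this coordinate.
            successes = sum(1 for _ in range(n) if rng.random() < pj)
            worst = max(worst, abs(successes / n - pj))
        total += worst
    return total / trials


if __name__ == "__main__":
    # Same dimension, different instances: the estimate depends on p, not only on d.
    for n in (50, 200, 800):
        balanced = expected_max_deviation([0.5] * 20, n)
        sparse = expected_max_deviation([0.01] * 20, n)
        print(n, round(balanced, 4), round(sparse, 4))
```

Running the script for several values of `n` and for different mean vectors of the same dimension gives a rough empirical view of the instance dependence the abstract refers to; it says nothing about the $d=\infty$ case.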