Concentration inequalities for the sample mean, like those due to Bernstein and Hoeffding, are valid for any sample size but overly conservative, yielding confidence intervals that are unnecessarily wide. The central limit theorem (CLT) provides asymptotic confidence intervals with optimal width, but these carry no validity guarantee at any finite sample size. To resolve this tension, we develop new computable concentration inequalities with asymptotically optimal size, finite-sample validity, and sub-Gaussian decay. These bounds enable the construction of efficient confidence intervals with correct coverage for any sample size and efficient empirical Berry-Esseen bounds that require no prior knowledge of the population variance. We derive our inequalities by tightly bounding non-uniform Kolmogorov and Wasserstein distances to a Gaussian using zero-bias couplings and Stein's method of exchangeable pairs.
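To make the tension concrete, the sketch below compares the half-width of a classical two-sided Hoeffding interval with that of the asymptotic CLT interval for i.i.d. observations confined to a bounded interval. It illustrates only the two baselines named above, not the inequalities developed in this work. The function names, the choice of observations in [0, 1], and the treatment of the standard deviation `sigma` as known are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def hoeffding_half_width(n, delta, value_range=1.0):
    """Two-sided Hoeffding half-width at confidence level 1 - delta.

    Assumes n i.i.d. observations lying in an interval of length value_range.
    Valid for every n, but ignores the variance entirely.
    """
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def clt_half_width(n, delta, sigma):
    """Asymptotic CLT half-width z_{1 - delta/2} * sigma / sqrt(n).

    Assumes sigma (the population standard deviation) is known.
    Carries no finite-sample coverage guarantee.
    """
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return z * sigma / math.sqrt(n)


def width_ratio(delta, sigma, value_range=1.0):
    """Ratio of Hoeffding to CLT half-widths; the 1/sqrt(n) factors cancel."""
    return hoeffding_half_width(1, delta, value_range) / clt_half_width(1, delta, sigma)


if __name__ == "__main__":
    delta = 0.05
    for sigma in (0.5, 0.25, 0.1):  # 0.5 is the largest possible sigma on [0, 1]
        print(f"sigma={sigma}: Hoeffding/CLT width ratio = {width_ratio(delta, sigma):.2f}")
```

Under these assumptions the ratio does not depend on the sample size, and it exceeds 1 even at the largest standard deviation possible on [0, 1]. It grows as the variance shrinks, which is the sense in which the Hoeffding interval is unnecessarily wide.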