High-dimensional central limit theorems have been studied intensively, with most attention devoted to the case where the data are sub-Gaussian or sub-exponential. However, heavier tails are omnipresent in practice. In this article, we study the critical growth rates of the dimension $d$ below which Gaussian approximations are asymptotically valid but beyond which they are not. We are particularly interested in how these thresholds depend on the number of moments $m$ that the observations possess. For every $m\in(2,\infty)$, we construct i.i.d. random vectors $\textbf{X}_1,\ldots,\textbf{X}_n$ in $\mathbb{R}^d$ whose entries are independent and share a common distribution (not depending on $n$ or $d$) with finite $m$th absolute moment, such that the following holds: if there exists an $\varepsilon\in(0,\infty)$ such that $d/n^{m/2-1+\varepsilon}\not\to 0$, then the Gaussian approximation error (GAE) satisfies $$ \limsup_{n\to\infty}\sup_{t\in\mathbb{R}}\left[\mathbb{P}\left(\max_{1\leq j\leq d}\frac{1}{\sqrt{n}}\sum_{i=1}^n\textbf{X}_{ij}\leq t\right)-\mathbb{P}\left(\max_{1\leq j\leq d}\textbf{Z}_j\leq t\right)\right]=1,$$ where $\textbf{Z} \sim \mathsf{N}_d(\textbf{0}_d,\mathbf{I}_d)$. On the other hand, a result in Chernozhukov et al. (2023a) implies that the left-hand side above is zero as soon as $d/n^{m/2-1-\varepsilon}\to 0$ for some $\varepsilon\in(0,\infty)$. In this sense, there is a moment-dependent phase transition at the threshold $d=n^{m/2-1}$: as the growth rate of $d$ crosses it, the limiting GAE jumps from zero to one.
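To make the threshold concrete, the following minimal sketch specializes the two statements above to polynomial growth $d = n^a$, which is an assumption made here purely for illustration. The function names `threshold_exponent` and `limiting_gae_regime` are hypothetical and do not come from the article. For $d = n^a$, the condition $d/n^{m/2-1+\varepsilon}\not\to 0$ holds for some $\varepsilon>0$ exactly when $a > m/2-1$, and $d/n^{m/2-1-\varepsilon}\to 0$ holds for some $\varepsilon>0$ exactly when $a < m/2-1$. The boundary case $a = m/2-1$ is covered by neither statement.

```python
def threshold_exponent(m: float) -> float:
    """Return the critical exponent m/2 - 1 for observations with m moments.

    The abstract considers m in (2, infinity), so other values are rejected.
    """
    if m <= 2:
        raise ValueError("the abstract considers m in (2, infinity)")
    return m / 2 - 1


def limiting_gae_regime(a: float, m: float) -> str:
    """Classify the limiting GAE when d = n**a (illustrative assumption).

    "zero": d / n**(m/2 - 1 - eps) -> 0 for some eps > 0, so the limiting
            GAE is zero by the cited result.
    "one":  d / n**(m/2 - 1 + eps) does not tend to 0 for some eps > 0, so
            the limsup of the GAE is one for the constructed distribution.
    "undetermined": a sits exactly at the threshold, which neither
            statement in the abstract covers.
    """
    kappa = threshold_exponent(m)
    if a < kappa:
        return "zero"
    if a > kappa:
        return "one"
    return "undetermined"


if __name__ == "__main__":
    # With m = 4 finite moments the threshold is d = n**1.
    for a in (0.9, 1.0, 1.1):
        print(f"m=4, d=n^{a}: limiting GAE regime = {limiting_gae_regime(a, 4)}")
```

With $m=4$ the threshold is $d=n$, and with $m=3$ it is $d=\sqrt{n}$, so each additional moment raises the admissible polynomial growth exponent of $d$ by $1/2$.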