Wasserstein-p Central Limit Theorem Rates: From Local Dependence to Markov Chains

Non-asymptotic central limit theorem (CLT) rates play a central role in modern machine learning and operations research. In this paper, we study CLT rates for multivariate dependent data in Wasserstein-$p$ ($W_p$) distance, for general $p\ge 1$. We focus on two fundamental dependence structures that commonly arise in practice: locally dependent sequences and geometrically ergodic Markov chains. In both settings, we establish the first optimal $\mathcal O(n^{-1/2})$ rate in $W_1$, as well as the first $W_p$ ($p\ge 2$) CLT rates under mild moment assumptions, substantially improving the best previously known bounds in these dependent-data regimes. As an application of our optimal $W_1$ rate for locally dependent sequences, we further obtain the first optimal $W_1$-CLT rate for multivariate $U$-statistics. On the technical side, we derive a tractable auxiliary bound for $W_1$ Gaussian approximation errors that is well suited to dependent data. For Markov chains, we further prove that the regeneration time of the split chain associated with a geometrically ergodic chain has a geometric tail, without assuming strong aperiodicity or other restrictive conditions. These tools may be of independent interest; they enable our optimal $W_1$ rates and underpin our $W_p$ ($p\ge 2$) results.
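
For readers less familiar with the metric, the standard definition of the Wasserstein-$p$ distance is recalled here as a reference sketch. The notation ($\mu$, $\nu$, $\Gamma(\mu,\nu)$, $S_n$, $\Sigma$, $C$) and the choice of the Euclidean norm are assumptions made for illustration and are not taken from the abstract. For probability measures $\mu,\nu$ on $\mathbb R^d$ with finite $p$-th moments,

$$
W_p(\mu,\nu) \;=\; \Bigl(\,\inf_{\gamma\in\Gamma(\mu,\nu)} \int_{\mathbb R^d\times\mathbb R^d} \|x-y\|^p \,\mathrm d\gamma(x,y)\Bigr)^{1/p},
\qquad p\ge 1,
$$

where $\Gamma(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$. For $p=1$, Kantorovich-Rubinstein duality gives the equivalent form

$$
W_1(\mu,\nu) \;=\; \sup_{\|f\|_{\mathrm{Lip}}\le 1} \Bigl|\int f\,\mathrm d\mu - \int f\,\mathrm d\nu\Bigr| .
$$

In this notation, an $\mathcal O(n^{-1/2})$ rate in $W_1$ would be a bound of the schematic form $W_1\bigl(\mathcal L(S_n/\sqrt n),\,\mathcal N(0,\Sigma)\bigr)\le C\,n^{-1/2}$, with $S_n$ a centered partial sum of the data, $\Sigma$ its limiting covariance, and $C$ a constant independent of $n$; the precise normalization and constants are those of the paper and are not reproduced here.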


