Time-uniform central limit theory and asymptotic confidence sequences

Confidence intervals based on the central limit theorem (CLT) are a cornerstone of classical statistics. Despite being only asymptotically valid, they are ubiquitous because they permit statistical inference under very weak assumptions, and can often be applied to problems even when nonasymptotic inference is impossible. This paper introduces time-uniform analogues of such asymptotic confidence intervals. To elaborate, our methods take the form of confidence sequences (CS) -- sequences of confidence intervals that are uniformly valid over time. CSs provide valid inference at arbitrary stopping times, incurring no penalties for "peeking" at the data, unlike classical confidence intervals which require the sample size to be fixed in advance. Existing CSs in the literature are nonasymptotic, and hence do not enjoy the aforementioned broad applicability of asymptotic confidence intervals. Our work bridges the gap by giving a definition for "asymptotic CSs", and deriving a universal asymptotic CS that requires only weak CLT-like assumptions. While the CLT approximates the distribution of a sample average by that of a Gaussian at a fixed sample size, we use strong invariance principles (stemming from the seminal 1960s work of Strassen and improvements by Komlós, Major, and Tusnády) to uniformly approximate the entire sample average process by an implicit Gaussian process. As an illustration of our theory, we derive asymptotic CSs for the average treatment effect using efficient estimators in observational studies (for which no nonasymptotic bounds can exist even in the fixed-time regime) as well as randomized experiments, enabling causal inference that can be continuously monitored and adaptively stopped.
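
To make the "peeking" problem concrete, the following is a minimal simulation sketch; it is not taken from the paper. It assumes i.i.d. standard Gaussian observations with true mean zero, and the function name `clt_miscoverage` and the parameters `n_max`, `n_min`, `n_runs`, and `seed` are hypothetical choices made for illustration. The sketch recomputes a nominal 95% CLT interval after every observation and compares two miscoverage rates: how often the interval excludes the truth at the single, pre-specified sample size `n_max`, and how often it excludes the truth at some point while the data are being monitored. A confidence sequence is designed to keep the second rate below the nominal level; the fixed-time interval shown here is not.

```python
import math
import random

Z_975 = 1.959963984540054  # 0.975 quantile of the standard normal


def clt_miscoverage(n_max=1000, n_min=10, n_runs=500, seed=0):
    """Estimate miscoverage of a nominal 95% CLT interval for the mean.

    Returns a pair:
      (rate of excluding the true mean at the fixed time n_max,
       rate of excluding it at any time t in n_min..n_max).
    Assumption for illustration: i.i.d. N(0, 1) data.
    """
    rng = random.Random(seed)
    true_mean = 0.0
    miss_fixed = 0
    miss_anytime = 0
    for _ in range(n_runs):
        total = 0.0
        total_sq = 0.0
        missed_ever = False
        missed_last = False
        for t in range(1, n_max + 1):
            x = rng.gauss(true_mean, 1.0)
            total += x
            total_sq += x * x
            if t < n_min:
                continue  # skip very small samples before forming intervals
            mean = total / t
            # Unbiased sample variance, guarded against rounding below zero.
            var = max((total_sq - t * mean * mean) / (t - 1), 0.0)
            half_width = Z_975 * math.sqrt(var / t)
            missed_last = abs(mean - true_mean) > half_width
            missed_ever = missed_ever or missed_last
        # After the inner loop, missed_last refers to the interval at t = n_max.
        miss_fixed += missed_last
        miss_anytime += missed_ever
    return miss_fixed / n_runs, miss_anytime / n_runs


if __name__ == "__main__":
    fixed_rate, anytime_rate = clt_miscoverage()
    print(f"miscoverage at the fixed sample size: {fixed_rate:.3f}")
    print(f"miscoverage at some time while monitoring: {anytime_rate:.3f}")
```

Under these assumptions the fixed-time rate should sit near the nominal 5%, while the monitored rate is typically several times larger and keeps growing as `n_max` increases.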
