The Glivenko--Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, namely permutation invariance and empirical adequacy, that a morphism of type $X^{\mathbb{N}} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such \emph{empirical sampling morphisms} live in quasi-Markov categories, which, unlike Markov categories, allow for partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko--Cantelli theorem and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko--Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof of these two theorems in conjunction with the de Finetti theorem from first principles.
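For orientation, the two classical statements that the abstract results are meant to recover can be written as follows. The notation ($X_i$, $F$, $F_n$) is introduced here for illustration only and is an assumption, not the paper's own. Let $X_1, X_2, \dots$ be IID real-valued random variables with CDF $F$, and let $F_n$ denote the empirical CDF of the first $n$ samples:
\[
  F_n(x) \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}, \qquad x \in \mathbb{R}.
\]
The Glivenko--Cantelli theorem asserts uniform convergence of $F_n$ to $F$ almost surely,
\[
  \sup_{x \in \mathbb{R}} \bigl| F_n(x) - F(x) \bigr| \;\xrightarrow[n \to \infty]{}\; 0 \quad \text{almost surely},
\]
while the strong law of large numbers asserts, under the finite first moment assumption $\mathbb{E}|X_1| < \infty$,
\[
  \frac{1}{n} \sum_{i=1}^{n} X_i \;\xrightarrow[n \to \infty]{}\; \mathbb{E}[X_1] \quad \text{almost surely}.
\]
For each fixed $x$, the convergence $F_n(x) \to F(x)$ is the strong law applied to the indicator variables $\mathbf{1}\{X_i \le x\}$; the additional content of Glivenko--Cantelli is that this convergence is uniform in $x$.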