The Glivenko-Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, permutation invariance and empirical adequacy, that a morphism of type $X^\mathbb{N} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such ``empirical sampling morphisms'' live in quasi-Markov categories, which, unlike Markov categories, allow partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko-Cantelli theorem, and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko-Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof, from first principles, of these two theorems in conjunction with the de Finetti theorem.
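For orientation, the two classical statements recovered by the instantiation can be written as follows. The notation ($X_i$, $F$, $F_n$) is chosen here for illustration and is not taken from the paper. Let $X_1, X_2, \dots$ be IID real-valued random variables with CDF $F$, and let $F_n$ denote the empirical CDF of the first $n$ samples:
\[
  F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}.
\]
The Glivenko-Cantelli theorem then asserts uniform convergence almost surely,
\[
  \sup_{x \in \mathbb{R}} \bigl| F_n(x) - F(x) \bigr| \xrightarrow[n \to \infty]{} 0 \quad \text{almost surely},
\]
while the strong law of large numbers, under the finite first moment assumption $\mathbb{E}|X_1| < \infty$, asserts
\[
  \frac{1}{n} \sum_{i=1}^{n} X_i \xrightarrow[n \to \infty]{} \mathbb{E}[X_1] \quad \text{almost surely}.
\]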