The Glivenko--Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, namely permutation invariance and empirical adequacy, that a morphism of type $X^{\mathbb{N}} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such \emph{empirical sampling morphisms} live in quasi-Markov categories, which, unlike Markov categories, allow for partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko--Cantelli theorem and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko--Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof of these two theorems in conjunction with the de Finetti theorem from first principles.
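For orientation, the two classical statements that the abstract results are said to recover can be written as follows. This is only a sketch in standard textbook notation: the symbols $X_i$, $F$ and $F_n$ are assumptions introduced here for illustration and are not taken from the text above. Let $X_1, X_2, \dots$ be IID real-valued random variables with CDF $F$, and let $F_n$ denote the empirical CDF of the first $n$ samples:
\[
  F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\},
  \qquad
  \sup_{x \in \mathbb{R}} \bigl| F_n(x) - F(x) \bigr| \xrightarrow[n \to \infty]{} 0 \quad \text{almost surely.}
\]
If in addition $\mathbb{E}|X_1| < \infty$, the strong law of large numbers reads
\[
  \frac{1}{n} \sum_{i=1}^{n} X_i \xrightarrow[n \to \infty]{} \mathbb{E}[X_1] \quad \text{almost surely.}
\]
The first statement is uniform in $x$, which is the sense in which the Glivenko--Cantelli theorem strengthens the pointwise convergence $F_n(x) \to F(x)$ obtained by applying the strong law to the indicator variables $\mathbf{1}\{X_i \le x\}$ for a fixed $x$.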