In statistics, independent, identically distributed random samples do not carry a natural ordering, and their statistics are typically invariant with respect to permutations of their order. Thus, an $n$-sample in a space $M$ can be considered as an element of the quotient space of $M^n$ modulo the permutation group. The present paper takes this definition of sample space and the related concept of orbit types as a starting point for developing a geometric perspective on statistics. We aim to derive a general mathematical setting for studying the behavior of empirical and population means in spaces ranging from smooth Riemannian manifolds to general stratified spaces. We fully describe the orbifold and path-metric structure of the sample space when $M$ is a manifold or path-metric space, respectively. These results are non-trivial even when $M$ is Euclidean. We show that the infinite sample space exists in a Gromov-Hausdorff-type sense and coincides with the Wasserstein space of probability distributions on $M$. We exhibit Fr\'echet means and $k$-means as metric projections onto 1-skeleta or $k$-skeleta in Wasserstein space, and we define a new and more general notion of polymeans. This geometric characterization via metric projections applies equally to sample and population means, and we use it to establish asymptotic properties of polymeans such as consistency and asymptotic normality.
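As a minimal illustration of the quotient construction, the following sketch treats the simplest case $M = \mathbb{R}$ and is not taken from the paper. It assumes that the distance between two unordered $n$-samples is the minimum over permutations of the Euclidean distance, scaled by $1/\sqrt{n}$ so that it agrees with the 2-Wasserstein distance between the corresponding empirical measures. It also assumes that the 1-skeleton can be read as the set of constant samples $(m, \dots, m)$. The function names are hypothetical. Under these assumptions, the code checks that the quotient distance is attained by sorting both samples, and that the arithmetic mean is the closest constant sample.

```python
from itertools import permutations
from math import sqrt


def quotient_distance(x, y):
    """Distance between unordered samples x and y of equal size in R.

    Assumption: minimum over all permutations of y of the Euclidean
    distance, normalized by 1/sqrt(n). Brute force, so only for small n.
    """
    n = len(x)
    return min(
        sqrt(sum((a - b) ** 2 for a, b in zip(x, p)) / n)
        for p in permutations(y)
    )


def sorted_distance(x, y):
    """Same normalized distance, computed by matching sorted values."""
    n = len(x)
    return sqrt(sum((a - b) ** 2 for a, b in zip(sorted(x), sorted(y))) / n)


def distance_to_constant(x, m):
    """Normalized distance from the sample x to the constant sample (m, ..., m)."""
    n = len(x)
    return sqrt(sum((a - m) ** 2 for a in x) / n)


x = [3.0, 0.0, 1.0]
y = [2.0, 5.0, -1.0]

d_quot = quotient_distance(x, y)  # minimum over all 3! orderings of y
d_sort = sorted_distance(x, y)    # one-dimensional shortcut via sorting
mean_x = sum(x) / len(x)          # candidate projection onto constant samples

print(d_quot, d_sort)
print(distance_to_constant(x, mean_x))
```

The brute-force minimum is only feasible for very small $n$. In one dimension, sorting gives the same value because the monotone matching is optimal for quadratic cost.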