The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim to estimate the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that end, Chazal and Divol (2019) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. Here we are concerned with estimating the persistence intensity function and a novel, normalized version of it, which we call the persistence density function. We present a class of kernel-based estimators built from an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function admits stronger statistical guarantees than the persistence intensity function.
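To make the kernel-based idea concrete, the following is a minimal sketch of one natural estimator of this type: each diagram is smoothed with a kernel, and the smoothed diagrams are averaged over the sample. It is an illustration under stated assumptions, not the estimator as specified in the paper. The Gaussian kernel, the single scalar bandwidth `h`, the function names, and the convention that an empty diagram contributes zero are all assumptions made here. Setting `normalize=True` divides each diagram by its number of points, which is one way to read the "normalized version" mentioned above.

```python
import numpy as np


def gaussian_kernel_2d(u):
    """Standard bivariate Gaussian kernel, evaluated along the last axis of u."""
    return np.exp(-0.5 * np.sum(u ** 2, axis=-1)) / (2.0 * np.pi)


def kernel_intensity(diagrams, grid, h, normalize=False):
    """Average of kernel-smoothed persistence diagrams on a set of grid points.

    diagrams : list of arrays of shape (m_i, 2) holding (birth, death) pairs
    grid     : array of shape (g, 2) of evaluation points
    h        : bandwidth (assumed scalar and positive)
    normalize: if True, divide each diagram by its number of points
               (assumed form of the normalized, density-type estimator)
    """
    if len(diagrams) == 0:
        raise ValueError("need at least one diagram")
    grid = np.asarray(grid, dtype=float).reshape(-1, 2)
    total = np.zeros(len(grid))
    for d in diagrams:
        d = np.asarray(d, dtype=float).reshape(-1, 2)
        if len(d) == 0:
            continue  # assumption: an empty diagram contributes zero
        # Pairwise scaled differences between grid points and diagram points.
        diffs = (grid[:, None, :] - d[None, :, :]) / h  # shape (g, m_i, 2)
        smoothed = gaussian_kernel_2d(diffs).sum(axis=1) / h ** 2
        if normalize:
            smoothed = smoothed / len(d)
        total += smoothed
    return total / len(diagrams)


# Toy usage: two small diagrams evaluated at a single point.
sample = [np.array([[0.0, 1.0]]), np.array([[0.0, 1.0], [0.0, 1.0]])]
point = np.array([[0.0, 1.0]])
intensity_hat = kernel_intensity(sample, point, h=1.0)
density_hat = kernel_intensity(sample, point, h=1.0, normalize=True)
```

In this toy example the unnormalized estimate is sensitive to how many points each diagram contains, whereas the normalized estimate is not. This is only an intuition for why the two targets can behave differently; the rates themselves come from the paper's analysis.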