\v{C}ech persistence diagrams (PDs) are topological descriptors routinely used to capture the geometry of complex datasets. They are commonly compared using the Wasserstein distances $OT_{p}$; however, the extent to which PDs are stable with respect to these metrics remains poorly understood. We partially close this gap by focusing on the case where datasets are sampled on an $m$-dimensional submanifold of $\mathbb{R}^{d}$. Under this manifold hypothesis, we show that convergence with respect to the $OT_{p}$ metric happens exactly when $p > m$. We also improve upon the bottleneck stability theorem in this case and prove new laws of large numbers for the total $\alpha$-persistence of PDs. Finally, we show how these theoretical findings shed new light on the behavior of the feature maps on the space of PDs that are used in ML-oriented applications of Topological Data Analysis.