\v{C}ech persistence diagrams (PDs) are topological descriptors routinely used to capture the geometry of complex datasets. They are commonly compared using the Wasserstein distances $OT_{p}$; however, the extent to which PDs are stable with respect to these metrics remains poorly understood. We partially close this gap by focusing on the case where datasets are sampled on an $m$-dimensional submanifold of $\mathbb{R}^{d}$. Under this manifold hypothesis, we show that convergence with respect to the $OT_{p}$ metric holds exactly when $p > m$. We also improve upon the bottleneck stability theorem in this setting and prove new laws of large numbers for the total $\alpha$-persistence of PDs. Finally, we show how these theoretical findings shed new light on the behavior of the feature maps on the space of PDs that are used in ML-oriented applications of Topological Data Analysis.
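For reference, the two quantities named above can be written as follows. This is a sketch under the usual conventions, which we assume match those intended here: a PD $D$ is a multiset of points $(b, d)$ with $b < d$, $\Delta$ denotes the diagonal $\{b = d\}$ counted with infinite multiplicity, and $\|\cdot\|$ is a fixed norm on $\mathbb{R}^{2}$ (the choice of norm is an assumption and is not specified in the text).

\[
OT_{p}(D_{1}, D_{2}) = \inf_{\gamma} \Big( \sum_{x \in D_{1} \cup \Delta} \| x - \gamma(x) \|^{p} \Big)^{1/p},
\qquad
\mathrm{Pers}_{\alpha}(D) = \sum_{(b, d) \in D} (d - b)^{\alpha},
\]

where $\gamma$ ranges over bijections from $D_{1} \cup \Delta$ to $D_{2} \cup \Delta$. Taking $p = \infty$ replaces the sum by a supremum and recovers the bottleneck distance.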