Persistent homology is a central method in topological data analysis that has been applied successfully in many fields and continues to grow in popularity. Its output is a persistence diagram, a multiset of points supported on the upper half-plane, which is often used as a statistical summary of the topological features of data. In this paper, we study the random nature of persistent homology and use wavelets to estimate the density of the expected persistence diagram from observations; we show that this wavelet-based estimator is optimal. Furthermore, we propose an estimator that yields a sparse representation of the expected persistence diagram and achieves near-optimality. We demonstrate the utility of our contributions in a machine learning task in the context of dynamical systems.
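To make the idea of estimating the density of an expected persistence diagram concrete, the following is a minimal sketch of a linear projection estimator at a single resolution level. It is not the estimator analyzed in the paper: it assumes the Haar basis (so the projection reduces to averaging point counts over dyadic cells), it assumes the diagrams have been rescaled so that all points lie in a bounded square, and the function name `haar_projection_density` and its parameters are hypothetical. The sparse, thresholded estimator mentioned above is not shown.

```python
import numpy as np


def haar_projection_density(diagrams, level, lo=0.0, hi=1.0):
    """Haar scaling-function projection estimate of the expected-diagram density.

    Illustrative assumption, not the paper's estimator.

    diagrams : list of arrays of shape (k_i, 2) holding (birth, death) points,
               assumed to lie in the square [lo, hi]^2.
    level    : resolution j; the square is split into 2^j x 2^j dyadic cells.

    Returns an array of shape (2^j, 2^j). Entry [a, b] is the estimated density
    on the a-th birth cell and b-th death cell.
    """
    m = 2 ** level
    edges = np.linspace(lo, hi, m + 1)
    cell_area = ((hi - lo) / m) ** 2

    counts = np.zeros((m, m))
    for d in diagrams:
        pts = np.asarray(d, dtype=float).reshape(-1, 2)
        # Count the points of this diagram that fall in each dyadic cell.
        h, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=[edges, edges])
        counts += h

    # Average count per diagram, divided by cell area, gives a piecewise-constant
    # density whose integral equals the mean number of points per diagram.
    return counts / (len(diagrams) * cell_area)


# Two toy diagrams with one and two points respectively.
toy_diagrams = [
    np.array([[0.1, 0.6]]),
    np.array([[0.1, 0.6], [0.3, 0.9]]),
]
estimate = haar_projection_density(toy_diagrams, level=1)
```

Because the expected persistence diagram is a measure rather than a probability distribution, the estimate integrates to the average number of points per diagram, not to one. Smoother wavelet families or coefficient thresholding would replace the simple cell averaging used here.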