In this paper we address the problem of representing functional data with the tools of algebraic topology. We represent functions by means of merge trees, which, like the more commonly used persistence diagrams, are invariant under homeomorphic reparametrizations of the functions they represent, thus allowing for a statistical analysis that is insensitive to functional misalignment. We consider a recently defined metric for merge trees and prove theoretical results on its implementation when merge trees represent functions; we also establish a class of consistent estimators with convergence rates. To showcase the good properties of our topological approach to functional data analysis, we test it on the Aneurisk65 dataset, replicating, from our different perspective, the supervised classification analysis that helped make this dataset a benchmark for methods dealing with misaligned functional data. In the Appendix we provide an extensive comparison between merge trees and persistence diagrams, highlighting similarities and differences that can guide the analyst in choosing between the two representations.