In this paper we address the problem of representing functional data using tools from algebraic topology. We represent functions by means of merge trees and compare this representation with the one offered by persistence diagrams. We show that these two structures, although not equivalent, are both invariant under homeomorphic re-parametrizations of the functions they represent, thus allowing for a statistical analysis that is insensitive to functional misalignment. We employ a novel metric for merge trees and prove some theoretical results related to its specific implementation when merge trees represent functions. To showcase the good properties of our topological approach to functional data analysis, we test it on the Aneurisk65 dataset, replicating, from our different perspective, the supervised classification analysis that contributed to making this dataset a benchmark for methods dealing with misaligned functional data.