In this paper we address the problem of representing functional data with the tools of algebraic topology. We represent functions by means of merge trees and compare this representation with the one offered by persistence diagrams. We show that these two topological representations, although not equivalent, are both invariant under homeomorphic re-parametrizations of the functions they represent, thus allowing for a statistical analysis that is insensitive to functional misalignment. We employ a novel metric for merge trees and prove several theoretical results related to its specific implementation when merge trees represent functions. To showcase the properties of our topological approach to functional data analysis, we first work through a few examples using data generated {\em in silico}, which illustrate and compare the representations provided by merge trees and persistence diagrams. We then test the approach on the Aneurisk65 dataset, replicating from our perspective the supervised classification analysis that helped make this dataset a benchmark for methods dealing with misaligned functional data.
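The invariance claim can be checked numerically on a toy case. The sketch below is not the authors' implementation: it computes only the 0-dimensional sublevel-set persistence diagram of a sampled one-dimensional function, using a union-find pass with the elder rule. The sequence of merge events it processes is the same information a merge tree records, but no tree is built here. The test function `f`, the warping `h(t) = t^2`, the grid size, and the tolerance are all illustrative assumptions.

```python
import math

def sublevel_persistence(values):
    """0-dim sublevel-set persistence pairs of a sampled 1D function (elder rule)."""
    n = len(values)
    parent = [-1] * n   # -1 marks a sample not yet added to the filtration
    birth = [0.0] * n   # birth value, meaningful at component roots

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    # Add samples in order of increasing function value.
    for i in sorted(range(n), key=values.__getitem__):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the younger component (larger birth) dies here.
                if birth[a] < birth[b]:
                    a, b = b, a
                if values[i] > birth[a]:
                    pairs.append((birth[a], values[i]))
                parent[a] = b
    pairs.append((min(values), math.inf))  # the essential class never dies
    return sorted(pairs)

def same_diagram(p, q, tol=1e-4):
    """Compare two sorted diagrams pair by pair, up to a sampling tolerance."""
    if len(p) != len(q):
        return False
    for (b1, d1), (b2, d2) in zip(p, q):
        if abs(b1 - b2) > tol:
            return False
        if d1 != d2 and abs(d1 - d2) > tol:
            return False
    return True

def f(t):
    # Illustrative test function on [0, 1] (an assumption, not from the paper).
    return math.sin(6 * math.pi * t) + 0.5 * t

def h(t):
    # Increasing homeomorphism of [0, 1], playing the role of a misalignment.
    return t * t

n = 20001
grid = [k / (n - 1) for k in range(n)]
pd_f = sublevel_persistence([f(t) for t in grid])
pd_g = sublevel_persistence([f(h(t)) for t in grid])
print(same_diagram(pd_f, pd_g))
```

The warped curve `f(h(t))` looks different from `f(t)` when plotted against `t`, yet both produce the same birth and death values up to discretization error, because a re-parametrization moves critical points along the domain without changing their heights or their order.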