We characterize structures such as monotonicity, convexity, and modality in smooth regression curves using persistent homology. Persistent homology is a key tool in topological data analysis that detects topological features of different dimensions, such as connected components and holes (cycles or loops), in data. In other words, it is a multiscale version of homology that characterizes sets by their connected components and holes. We extract geometric features from the persistent homology of the super-level sets of a function; here, the function of interest is the first derivative of the regression function. In the course of this study, we extend an existing procedure for estimating persistent homology to the first derivative of a regression function and establish its consistency. As an application of the proposed methodology, we demonstrate that the persistent homology of the derivative of a function can reveal hidden structures in the function that are not visible from the persistent homology of the function itself. We also illustrate that the proposed procedure can be used to compare the shapes of two or more regression curves, which is not possible from the persistent homology of the functions themselves alone.
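To make the idea of super-level-set persistence concrete, the following is a minimal sketch and not the estimation procedure of the paper. It assumes the function is already known on a one-dimensional grid, and it computes only 0-dimensional persistence (connected components) with a union-find pass. The name `superlevel_persistence`, the test curve `x + 0.25 sin(3x)`, and the use of its exact derivative in place of an estimated one are illustrative assumptions.

```python
import math


def superlevel_persistence(values):
    """0-dimensional persistence of the super-level sets of a sampled 1D function.

    Returns (birth, death) pairs with birth > death. The component of the
    global maximum never dies and is reported last with death = -inf.
    """
    n = len(values)
    if n == 0:
        return []
    parent = [None] * n  # None marks a grid point not yet in the super-level set
    birth = {}           # root index -> peak value at which its component was born

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    # Lower the threshold from the top: visit grid points from highest to lowest value.
    for i in sorted(range(n), key=lambda k: values[k], reverse=True):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component with the lower peak dies at this level.
                if birth[a] < birth[b]:
                    a, b = b, a
                if birth[b] > values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[b], values[i]))
                parent[b] = a
    pairs.append((max(values), -math.inf))
    return pairs


# Hypothetical example: a strictly increasing curve whose derivative is bimodal.
xs = [0.2 + 0.01 * k for k in range(281)]          # grid on [0.2, 3.0]
f = [x + 0.25 * math.sin(3 * x) for x in xs]       # monotone curve
df = [1 + 0.75 * math.cos(3 * x) for x in xs]      # its exact first derivative

pairs_f = superlevel_persistence(f)    # only the essential class
pairs_df = superlevel_persistence(df)  # one finite pair plus the essential class
print(len(pairs_f), len(pairs_df))
```

In this sketch the curve itself yields a single class, because a monotone function has connected super-level sets at every threshold, while its derivative yields an additional finite pair that records the two modes of the slope. This is the kind of structure the abstract describes as invisible to the persistent homology of the function itself.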