We use persistent homology to characterize structures such as monotonicity, convexity, and modality in smooth regression curves. Persistent homology is a key tool in topological data analysis that detects topological features of different dimensions, such as connected components and holes (cycles or loops), in data. In other words, it is a multiscale version of homology that characterizes sets by their connected components and holes. We extract geometric features from the persistent homology of super-level sets of a function; in particular, we explore structures in regression curves by taking the function of interest to be the first derivative of the regression function. In the course of this study, we extend an existing procedure for estimating the persistent homology for the first derivative of a regression function and establish its consistency. As an application of the proposed methodology, we demonstrate that the persistent homology of the derivative of a function can reveal hidden structures in the function that are not visible from the persistent homology of the function itself. We also illustrate that the proposed procedure can be used to compare the shapes of two or more regression curves, which is not possible from the persistent homology of the functions themselves alone.
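To make the idea concrete, the following is a minimal sketch of 0-dimensional persistent homology for the super-level sets of a function sampled on a one-dimensional grid. It is an illustration only and not the estimator studied in this work: the function name `superlevel_persistence_1d`, the test curve `f`, and the use of `np.gradient` as a stand-in for a derivative estimator are all assumptions made for this example. Each local maximum gives birth to a connected component as the level is lowered, and a component dies when it merges into one with a higher peak.

```python
import numpy as np

def superlevel_persistence_1d(values):
    """Return (birth, death) pairs of connected components of the
    super-level sets {x : g(x) >= t} for a function sampled on a 1D grid.
    Hypothetical helper written for this illustration."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    parent = [-1] * n          # -1 marks grid points not yet in the super-level set
    birth = {}                 # birth level of the component rooted at each index
    pairs = []

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Lower the level t from the maximum downwards, adding points one at a time.
    for i in np.argsort(-values, kind="stable"):
        i = int(i)
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the lower peak dies at this level.
                if birth[ri] < birth[rj]:
                    ri, rj = rj, ri
                if birth[rj] > values[i]:
                    pairs.append((float(birth[rj]), float(values[i])))
                parent[rj] = ri

    # The component containing the global maximum never dies.
    pairs.append((float(values.max()), float("-inf")))
    return sorted(pairs, reverse=True)

# Assumed example curve: strictly increasing, but with an oscillating slope.
x = np.linspace(0.0, 4.0 * np.pi, 401)
f = x + 0.5 * np.sin(x)
df = np.gradient(f, x)         # crude finite-difference stand-in for a derivative estimate

pairs_f = superlevel_persistence_1d(f)
pairs_df = superlevel_persistence_1d(df)

# Keep only finite pairs whose persistence exceeds an arbitrary threshold.
significant_df = [(b, d) for b, d in pairs_df if np.isfinite(d) and b - d > 0.5]
print(len(pairs_f), len(significant_df))
```

In this example the curve is monotone, so the persistent homology of the function itself contains only the single component that never dies, while the derivative shows additional long-lived components from the repeated peaks in the slope. The threshold of 0.5 is an arbitrary choice for this sketch.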