Topological Data Analysis (TDA) provides tools to describe the shape of data, but integrating topological features into deep learning pipelines remains challenging, especially when preserving local geometric structure rather than summarizing it globally. We propose a persistence-based data augmentation framework that encodes local gradient flow regions and their hierarchical evolution using the Morse-Smale complex. This representation, compatible with both convolutional and graph neural networks, retains spatially localized topological information across multiple scales. Importantly, the augmentation procedure itself is efficient, with computational complexity $O(n \log n)$, making it practical for large datasets. We evaluate our method on histopathology image classification and 3D porous material regression, where it consistently outperforms baselines and global TDA descriptors such as persistence images and landscapes. We also show that pruning the base level of the hierarchy reduces memory usage while maintaining competitive performance. These results highlight the potential of local, structured topological augmentation for scalable and interpretable learning across data modalities.
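To make the idea of a "local gradient flow region" concrete, the following is a minimal sketch rather than the framework's actual implementation. It assumes a 2D scalar field on a regular grid with 4-connectivity, and labels each pixel by the local minimum that its steepest-descent path reaches. The function name `descending_flow_labels` and the choice to stack the labels as an extra input channel are illustrative assumptions. The sketch omits the persistence-based hierarchy entirely. Sorting the pixels dominates the cost, so this simplified version runs in $O(n \log n)$ for $n$ pixels.

```python
import numpy as np


def descending_flow_labels(field):
    """Label each pixel with the local minimum reached by steepest descent.

    Simplified stand-in for the cells of a Morse-Smale complex
    (hypothetical helper, not the paper's code).
    """
    h, w = field.shape
    flat = field.ravel()
    # Total order on pixels: by value, ties broken by flat index (stable sort).
    order = np.argsort(flat, kind="stable")
    rank = np.empty(flat.size, dtype=np.int64)
    rank[order] = np.arange(flat.size)

    labels = np.full(flat.size, -1, dtype=np.int64)
    n_regions = 0
    # Visit pixels from lowest to highest, so the lowest neighbour of a
    # non-minimum pixel has always been labelled already.
    for idx in order:
        r, c = divmod(int(idx), w)
        best = idx
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                nb = rr * w + cc
                if rank[nb] < rank[best]:
                    best = nb
        if best == idx:
            # No lower neighbour: this pixel is a local minimum, new region.
            labels[idx] = n_regions
            n_regions += 1
        else:
            # Inherit the region of the steepest-descent neighbour.
            labels[idx] = labels[best]
    return labels.reshape(h, w), n_regions


# Usage: append the region labels to the input as an additional channel.
field = np.array([[0.0, 1.0, 2.0, 1.0, 0.0]])
labels, n_regions = descending_flow_labels(field)
augmented = np.stack([field, labels.astype(field.dtype)], axis=0)
```

Because the labels are defined per pixel, the added channel keeps the spatial layout of the input, which is what allows a convolutional network to consume it directly. A global descriptor such as a persistence image would discard that layout.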