Persistent homology is a central tool in topological data analysis, but its application to large, noisy datasets is often limited by computational cost and by spurious topological features. Noise not only increases data size but also obscures the underlying structure of the data. In this paper, we propose the Refined Characteristic Lattice Algorithm (RCLA), a grid-based method that integrates data reduction with threshold-based denoising. Using a threshold parameter $k$, RCLA removes noise while preserving the essential structure of the data in a single pass. We also prove a stability theorem under a homogeneous Poisson noise model: with high probability, it bounds the bottleneck distance between the persistence diagram of the output and that of the underlying shape. In addition, we introduce an automatic parameter selection method based on nearest-neighbor statistics. Experimental results demonstrate that RCLA consistently outperforms existing methods, and its effectiveness is further validated on a 3D shape classification task.
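To make the idea of combining grid-based reduction with a threshold $k$ concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each point is assigned to an axis-aligned cubic cell, that a cell is kept only if it contains at least $k$ points, and that each kept cell is represented by its center. The function name `grid_threshold_reduce` and the parameter `cell_size` are hypothetical and do not come from the abstract.

```python
import numpy as np


def grid_threshold_reduce(points, cell_size, k):
    """Illustrative grid reduction with a count threshold (assumed scheme).

    points    : array-like of shape (n, d)
    cell_size : side length of each cubic grid cell (hypothetical parameter)
    k         : minimum number of points a cell needs in order to be kept
    Returns the centers of the kept cells, shape (m, d).
    """
    points = np.asarray(points, dtype=float)
    if points.ndim != 2:
        raise ValueError("points must have shape (n, d)")
    if cell_size <= 0 or k < 1:
        raise ValueError("cell_size must be positive and k at least 1")

    # Integer index of the grid cell that contains each point.
    cells = np.floor(points / cell_size).astype(np.int64)

    # Count how many points fall into each occupied cell.
    unique_cells, counts = np.unique(cells, axis=0, return_counts=True)

    # Thresholding step: sparsely populated cells are treated as noise.
    kept = unique_cells[counts >= k]

    # Reduction step: one representative (the cell center) per kept cell.
    return (kept + 0.5) * cell_size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Points near the unit circle plus uniform background noise.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2000)
    circle = np.column_stack([np.cos(angles), np.sin(angles)])
    circle += rng.normal(scale=0.02, size=circle.shape)
    noise = rng.uniform(-2.0, 2.0, size=(200, 2))
    cloud = np.vstack([circle, noise])

    reduced = grid_threshold_reduce(cloud, cell_size=0.1, k=5)
    print(cloud.shape[0], "points reduced to", reduced.shape[0])
```

In this sketch the counting and the thresholding operate on the same grid, which is one way to read the single-pass claim: the output is both smaller than the input and free of points from sparsely populated cells. The reduced cloud could then be passed to any persistent homology library.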