The computation of Vietoris-Rips persistence barcodes is both execution-intensive and memory-intensive. In this paper, we study its computational structure and identify several unique mathematical properties and algorithmic opportunities with connections to the GPU. Mathematically and empirically, we examine the properties of apparent pairs, which are independently identifiable persistence pairs comprising up to 99\% of all persistence pairs. We prove tight upper and lower bounds on the apparent pair rate, as well as some probabilistic lower bounds. We also design massively parallel algorithms that take advantage of the very large number of simplices that can be processed independently of one another. Having identified these opportunities, we develop GPU-accelerated software for computing Vietoris-Rips persistence barcodes, called Ripser++. Under nice sampling conditions, we show that the expected work complexity of our algorithm is near linear in the number of simplices. The expected depth complexity depends only on the computation of the expected number of $p$-dimensional homological cycles. The software achieves up to 30x speedup over the total execution time of the original Ripser and also reduces CPU-memory usage by up to 2.0x. We believe our GPU-acceleration-based efforts open a new chapter for the advancement of topological data analysis in the post-Moore's Law era.