Box Filtration

We define a new framework that unifies the filtration and mapper approaches from topological data analysis (TDA), and present efficient algorithms to compute it. In this framework, termed the box filtration of a point cloud data set (PCD), we grow boxes (hyperrectangles) that are not necessarily centered at each point, in place of balls centered at points. We grow the boxes non-uniformly and asymmetrically in different dimensions based on the distribution of points. We present two approaches to handle the boxes: a point cover, where each point is assigned its own box at the start, and a pixel cover, which works with a pixelization of the space of the PCD. Any box cover in either setting automatically gives a mapper of the PCD. We show that the persistence diagrams generated by the box filtration using both point and pixel covers satisfy the classical stability result based on the Gromov-Hausdorff distance. Using boxes also implies that the box filtration is identical whether it is built from pairwise or higher-order intersections, whereas the Vietoris-Rips (VR) and Čech filtrations are not the same. Growth in each dimension is computed by solving a linear program (LP) that optimizes a cost functional balancing the cost of expansion against the benefit of including more points in the box. The box filtration algorithm runs in $O(m|U(0)|\log(mn\pi)L(q))$ time, where $m$ is the number of increment steps considered for box growth, $|U(0)|$ is the number of boxes in the initial cover ($\leq$ the number of points), $\pi$ is the step length for increasing each box dimension, each LP is solved in $O(L(q))$ time, $n$ is the dimension of the PCD, and $q = n \times |X|$. We demonstrate through multiple examples that the box filtration can summarize the topology of the PCD more accurately than the VR and distance-to-measure (DTM) filtrations. Software for our implementation is available at https://github.com/pragup/Box-Filteration.
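The claim about pairwise versus higher-order intersections rests on a standard property of axis-aligned boxes: if every pair in a family of boxes intersects, the whole family has a common point. Balls do not have this property, which is why the VR complex (pairwise test) and the Čech complex (common-intersection test) can differ. The following is a minimal illustrative sketch of that property only, not the authors' implementation; the helper names `boxes_share_point` and `pairwise_intersect` and the example coordinates are assumptions made for this illustration.

```python
from itertools import combinations
import math


def boxes_share_point(boxes):
    """Return True if the axis-aligned boxes have a common point.

    Each box is a pair (lower, upper) of coordinate tuples.
    Hypothetical helper written for this illustration.
    """
    dim = len(boxes[0][0])
    for k in range(dim):
        # Along each axis the intersection is the interval [max lower, min upper].
        lo = max(box[0][k] for box in boxes)
        hi = min(box[1][k] for box in boxes)
        if lo > hi:
            return False
    return True


def pairwise_intersect(boxes):
    """Return True if every pair of boxes intersects."""
    return all(boxes_share_point([a, b]) for a, b in combinations(boxes, 2))


# Three boxes in the plane that intersect pairwise.
boxes = [
    ((0.0, 0.0), (2.0, 2.0)),
    ((1.0, 1.0), (3.0, 3.0)),
    ((1.5, -1.0), (2.5, 1.5)),
]
boxes_pairwise = pairwise_intersect(boxes)
boxes_common = boxes_share_point(boxes)  # agrees with the pairwise test

# Three equal balls centered at the vertices of an equilateral triangle
# with side length 2. They intersect pairwise once r >= 1, but share a
# common point only once r >= 2 / sqrt(3), the circumradius.
centers = [(0.0, 0.0), (2.0, 0.0), (1.0, math.sqrt(3.0))]
r = 1.1
balls_pairwise = all(
    math.dist(p, q) <= 2 * r for p, q in combinations(centers, 2)
)
# For an equilateral triangle the centroid minimizes the largest distance
# to the vertices, so the balls share a point iff the centroid is within r.
centroid = tuple(sum(c[k] for c in centers) / 3 for k in range(2))
balls_common = max(math.dist(centroid, c) for c in centers) <= r

print("boxes:", boxes_pairwise, boxes_common)
print("balls:", balls_pairwise, balls_common)
```

For the boxes the pairwise and three-way tests agree, while for the balls with `r = 1.1` the pairwise test succeeds and the common-intersection test fails. This is the gap between a VR-style and a Čech-style complex that a box cover avoids.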
