Mapper graphs are widely used tools in topological data analysis and visualization. They can be understood as discrete approximations of Reeb graphs, providing insight into the shape and connectivity of complex data. Given a high-dimensional point cloud together with a real-valued function defined on it, a mapper graph summarizes the induced topological structure: each node represents a local neighborhood, and edges connect nodes whose corresponding neighborhoods overlap. Our focus is the interleaving distance for mapper graphs, which arises as a discretized analogue of the interleaving distance for Reeb graphs, a quantity known to be NP-hard to compute. This distance measures how similar two mapper graphs are by quantifying how much they must be ``stretched'' to be made comparable. Recent work introduced a loss function that gives an upper bound on this distance. The loss evaluates how far a given collection of maps, called an assignment, is from being a true interleaving. Importantly, it is computationally tractable, offering a practical way to bound the distance; however, the quality of the bound depends on the choice of assignment. In this paper, we develop the first framework for bounding the interleaving distance on mapper graphs. We present the bound in two ways: first, by formulating an integer linear program (ILP) that determines whether an $n$-interleaving exists for a given $n$; and second, by constructing an ILP that identifies an assignment with minimal loss for that $n$. We also evaluate the method on small examples where the interleaving distance is known, and on benchmark and simulated datasets, demonstrating the utility of the approach for classification tasks based on mapper graphs.
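To make the mapper construction described above concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the function names (`mapper_graph`, `_components`, `_dist`), the uniform interval cover with parameters `n_intervals` and `overlap`, and the clustering rule (connected components of an `eps`-neighborhood graph) are all assumptions chosen for illustration. Each node is a cluster of points whose function values fall in one cover interval, and two nodes are joined by an edge when their clusters share a point.

```python
from itertools import combinations


def mapper_graph(points, values, n_intervals=4, overlap=0.25, eps=1.0):
    """Toy mapper construction (illustrative assumptions, not the paper's code).

    points : list of coordinate tuples
    values : list of real function values, one per point
    Returns (nodes, edges): nodes is a list of frozensets of point indices,
    edges is a set of pairs (i, j), i < j, indexing into `nodes`.
    """
    lo, hi = min(values), max(values)
    # Interval length; fall back to 1.0 if the function is constant.
    length = (hi - lo) / n_intervals or 1.0
    pad = overlap * length  # how far each interval extends past its cell

    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        # Preimage of the k-th cover interval under the function.
        members = [i for i, v in enumerate(values) if a <= v <= b]
        # Each cluster in the preimage becomes one node.
        nodes.extend(_components(points, members, eps))

    # Connect two nodes whenever their point sets overlap.
    edges = {
        (i, j)
        for i, j in combinations(range(len(nodes)), 2)
        if nodes[i] & nodes[j]
    }
    return nodes, edges


def _components(points, members, eps):
    """Connected components of the eps-neighborhood graph on `members`."""
    unseen = set(members)
    comps = []
    while unseen:
        stack = [unseen.pop()]
        comp = set(stack)
        while stack:
            p = stack.pop()
            near = {q for q in unseen if _dist(points[p], points[q]) <= eps}
            unseen -= near
            comp |= near
            stack.extend(near)
        comps.append(frozenset(comp))
    return comps


def _dist(p, q):
    """Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
```

For instance, five collinear points `(0,0), ..., (4,0)` with the $x$-coordinate as the function, `n_intervals=2`, and the default `overlap` and `eps` give two nodes, `{0, 1, 2}` and `{2, 3, 4}`, joined by a single edge because both contain point 2. Practical mapper pipelines typically use more robust clustering than this neighborhood rule.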