Bounding the Interleaving Distance for Mapper Graphs with a Loss Function

Data consisting of a graph with a function to $\mathbb{R}^d$ arise in many applications, encompassing structures such as Reeb graphs, geometric graphs, and knot embeddings. Comparing and clustering such objects is therefore a required step in a data analysis pipeline, which creates a need for distances or metrics between them. In this work, we study the interleaving distance on discretizations of these objects, $\mathbb{R}^d$-mapper graphs, where functor representations of the data can be compared by finding pairs of natural transformations between them. However, in many cases, computing the interleaving distance is NP-hard. For this reason, we take inspiration from the work of Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation, called assignments. We then endow the functor images with the extra structure of a metric space and define a loss function that measures how far an assignment is from making the required diagrams of an interleaving commute. Finally, we show that the loss function can be computed in polynomial time. We believe this idea is both powerful and translatable, with the potential to be used for approximation and bounds on interleavings in a broad array of contexts.
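As a rough illustration of the idea of measuring how far a diagram is from commuting, the following Python sketch scores a family of squares of maps between finite sets whose targets carry a metric. It is a toy model under stated assumptions, not the construction from the paper: the names `square_defect`, `assignment_loss`, and `abs_dist`, the dictionary encoding of maps, and the choice of a maximum over elements and squares are all hypothetical. A score of zero means every square commutes; a brute-force pass over all elements of all squares is plainly polynomial in the input size.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple

# A map between finite sets, stored as a lookup table (illustrative encoding).
Map = Dict[Hashable, Any]
# A square  A --top--> B --right--> D  versus  A --left--> C --bottom--> D.
Square = Tuple[Map, Map, Map, Map]  # (top, right, left, bottom)


def square_defect(square: Square, dist: Callable[[Any, Any], float]) -> float:
    """Largest distance in D between the two ways around the square."""
    top, right, left, bottom = square
    return max(dist(right[top[a]], bottom[left[a]]) for a in top)


def assignment_loss(squares: Iterable[Square],
                    dist: Callable[[Any, Any], float]) -> float:
    """Worst defect over all squares; 0 means every square commutes."""
    return max((square_defect(sq, dist) for sq in squares), default=0.0)


def abs_dist(x: float, y: float) -> float:
    """Toy metric on the target set: absolute difference of numbers."""
    return abs(x - y)


# A square that commutes exactly.
commuting: Square = (
    {"a1": "b1", "a2": "b2"},   # top:    A -> B
    {"b1": 0, "b2": 2},         # right:  B -> D
    {"a1": "c1", "a2": "c2"},   # left:   A -> C
    {"c1": 0, "c2": 2},         # bottom: C -> D
)

# A square that fails to commute at a2 (2 on one path, 0 on the other).
broken: Square = (
    {"a1": "b1", "a2": "b2"},
    {"b1": 0, "b2": 2},
    {"a1": "c1", "a2": "c1"},
    {"c1": 0},
)

loss = assignment_loss([commuting, broken], abs_dist)
```

In this toy setting, a family of maps that is not a natural transformation still receives a finite score, which is the sense in which a loss can grade assignments that an exact interleaving check would simply reject.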

