Bounding the Interleaving Distance for Geometric Graphs with a Loss Function

A geometric graph is an abstract graph together with an embedding of that graph into the Euclidean plane; such graphs can model a wide range of data sets. Comparing and clustering these objects is a required step in a data analysis pipeline, which creates a need for distances or metrics on them. In this work, we study the interleaving distance on geometric graphs, in which functor representations of data are compared by finding pairs of natural transformations between them. However, in many cases, particularly for set-valued functors, computing the interleaving distance is NP-hard. For this reason, we take inspiration from the work of Robinson to find quality measures for families of maps that do not rise to the level of a natural transformation. Specifically, we call collections $\phi = \{\phi_U\mid U\}$ and $\psi = \{\psi_U\mid U\}$ that do not necessarily form a true interleaving an \textit{assignment}. In the case of embedded graphs, we impose a grid structure on the plane, treat this as a poset endowed with the Alexandroff topology $K$, and encode the embedded graph data as functors $F: \mathbf{Open}(K) \to \mathbf{Set}$, where $F(U)$ is the set of connected components of the graph inside the geometric realization of the set $U$. We then endow the image with the extra structure of a metric space and define a loss function $L(\phi,\psi)$ that measures how far the required diagrams of an interleaving are from commuting. Then, for a pair of assignments, we use this loss function to bound the interleaving distance, with an eye toward computation and approximation of the distance. We expect these ideas to be useful not only in our particular use case of embedded graphs, but also for a larger class of interleaving distance problems where computational complexity is a barrier to use in practice.
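To make the encoding $F(U)$ concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a square grid of side `cell_size`, represents an open set $U$ simply as a set of grid cells, and approximates "the graph inside the geometric realization of $U$" by the subgraph induced on vertices whose cells lie in $U$ (edges are not clipped geometrically). The names `cell_of`, `components_in`, and `inclusion_map` are hypothetical helpers introduced only for illustration.

```python
def cell_of(point, cell_size=1.0):
    """Return the integer grid cell (i, j) containing the point (x, y)."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def components_in(positions, edges, cells, cell_size=1.0):
    """Approximate F(U): connected components of the graph restricted to `cells`.

    Assumption (simplification): a vertex is kept when its grid cell is in
    `cells`, and an edge is kept only when both endpoints are kept.
    """
    inside = {v for v, p in positions.items() if cell_of(p, cell_size) in cells}
    parent = {v: v for v in inside}  # union-find forest over kept vertices

    def find(v):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in inside and v in inside:
            parent[find(u)] = find(v)

    groups = {}
    for v in inside:
        groups.setdefault(find(v), set()).add(v)
    return {frozenset(g) for g in groups.values()}


def inclusion_map(small, large):
    """For U contained in V, send each component of F(U) to the component of F(V) containing it."""
    return {c: next(d for d in large if c <= d) for c in small}


# A path a - b - c along the bottom row of the grid, plus an isolated vertex d.
positions = {"a": (0.5, 0.5), "b": (1.5, 0.5), "c": (2.5, 0.5), "d": (0.5, 2.5)}
edges = [("a", "b"), ("b", "c")]

U = {(0, 0), (2, 0)}          # misses the middle cell, so a and c are separated
V = {(0, 0), (1, 0), (2, 0)}  # contains U and reconnects the path

F_U = components_in(positions, edges, U)
F_V = components_in(positions, edges, V)
print(len(F_U), len(F_V))
print(inclusion_map(F_U, F_V))
```

In this toy example the two components of $F(U)$ merge into a single component of $F(V)$, which is the kind of structure map $F(U) \to F(V)$ that the natural transformations in an interleaving would have to respect.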

