Distance measures between graphs are important primitives for a variety of learning tasks. In this work, we describe an unsupervised, optimal-transport-based approach to defining a distance between graphs. Our idea is to represent each graph as a Gaussian mixture model fitted to the distribution of its sampled node embeddings, with all graphs embedded in the same space. The Wasserstein distance between these Gaussian mixture distributions then yields an interpretable and easily computable distance measure, which can be further tailored to the comparison at hand by choosing appropriate embeddings. We propose two embeddings for this framework and show that, under certain assumptions about the shape of the resulting Gaussian mixture components, this Wasserstein distance can be computed even more efficiently. An empirical validation of our findings on synthetic data and real-world functional brain connectivity networks shows promising performance compared to existing embedding methods.
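To make the pipeline concrete, the following is a minimal sketch of a component-level Wasserstein-type distance between two Gaussian mixtures. It is not necessarily the construction used in this work. It rests on three simplifying assumptions: each component has a diagonal covariance, all components carry equal weight, and both mixtures have the same number of components. Under these assumptions the squared 2-Wasserstein distance between two components has a simple closed form, and the optimal matching between components is a permutation, which the sketch finds by brute force (feasible only for a handful of components). The function names `gaussian_w2_squared` and `mixture_distance` and the toy inputs are hypothetical. Fitting the mixtures to node embeddings is assumed to have happened beforehand.

```python
from itertools import permutations
from math import sqrt


def gaussian_w2_squared(mean_a, var_a, mean_b, var_b):
    """Squared 2-Wasserstein distance between two Gaussians.

    Assumption: both covariances are diagonal, given as lists of variances.
    In that case the distance reduces to a squared difference of means plus
    a squared difference of standard deviations, coordinate by coordinate.
    """
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    cov_term = sum((sqrt(va) - sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b))
    return mean_term + cov_term


def mixture_distance(components_a, components_b):
    """Component-level distance between two equal-weight Gaussian mixtures.

    Each mixture is a list of (mean, variances) pairs. Assumption: both
    mixtures have K components of weight 1/K, so an optimal coupling between
    components is a permutation, found here by exhaustive search.
    """
    k = len(components_a)
    if k != len(components_b):
        raise ValueError("this sketch assumes equally many components")
    # Pairwise transport cost between components of the two mixtures.
    cost = [
        [gaussian_w2_squared(ma, va, mb, vb) for (mb, vb) in components_b]
        for (ma, va) in components_a
    ]
    # Cheapest one-to-one matching of components (brute force over K! options).
    best = min(
        sum(cost[i][perm[i]] for i in range(k))
        for perm in permutations(range(k))
    )
    return sqrt(best / k)


# Toy mixtures standing in for two graphs embedded in the same 2-D space.
graph_a = [([0.0, 0.0], [1.0, 1.0]), ([4.0, 0.0], [1.0, 1.0])]
graph_b = [([4.0, 0.0], [1.0, 1.0]), ([0.0, 3.0], [1.0, 1.0])]
print(mixture_distance(graph_a, graph_b))
```

For mixtures with unequal weights or different numbers of components, the permutation search would have to be replaced by a small linear program over the component weights. Full covariance matrices would require the general closed form for Gaussians, which involves matrix square roots.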