Finding the mean of sampled data is a fundamental task in machine learning and statistics. When the samples are graphs, however, defining a mean is inherently difficult. We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions, where graph similarity can be measured using the Wasserstein metric. By finding a mean in this embedding space, we can recover a mean graph that preserves structural information. We establish the existence and uniqueness of this graph mean and provide an iterative algorithm for computing it. To highlight the potential of our framework as a tool for practical machine learning applications, we evaluate it on various tasks, including k-means clustering of structured aligned graphs, classification of functional brain networks, and semi-supervised node classification in multi-layer graphs. Our experimental results demonstrate that our approach achieves consistent performance, outperforms existing baseline approaches, and improves the performance of state-of-the-art methods.
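To make the idea of averaging graphs in an embedding space concrete, the following is a minimal sketch and not the algorithm from the paper. It rests on several assumptions that the abstract does not state: each graph with Laplacian L is embedded as the zero-mean Gaussian with covariance (L + 11ᵀ/n)⁻¹ (a regularized form of the pseudo-inverse L†, a common model for smooth graph signals), the graphs are connected and share an aligned node set, the Wasserstein-2 distance uses the closed form for Gaussians, and the mean is computed with the standard fixed-point iteration for Gaussian Wasserstein barycenters. All function names are hypothetical.

```python
import numpy as np


def laplacian(adj):
    """Combinatorial Laplacian D - A of a weighted adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def signal_covariance(adj):
    """Assumed embedding: covariance (L + 11^T/n)^{-1} of smooth signals.

    Requires a connected graph so that the matrix is invertible.
    """
    n = len(adj)
    return np.linalg.inv(laplacian(adj) + np.ones((n, n)) / n)


def psd_sqrt(mat):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh((mat + mat.T) / 2)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def wasserstein2(cov_a, cov_b):
    """Wasserstein-2 distance between N(0, cov_a) and N(0, cov_b)."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    sq = np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(sq, 0.0)))


def gaussian_barycenter(covs, weights=None, iters=100, tol=1e-10):
    """Fixed-point iteration for the Wasserstein barycenter of Gaussians."""
    k = len(covs)
    weights = np.full(k, 1.0 / k) if weights is None else np.asarray(weights)
    s = sum(w * c for w, c in zip(weights, covs))  # initial guess
    for _ in range(iters):
        root = psd_sqrt(s)
        inv_root = np.linalg.inv(root)
        mid = sum(w * psd_sqrt(root @ c @ root) for w, c in zip(weights, covs))
        s_next = inv_root @ mid @ mid @ inv_root
        converged = np.linalg.norm(s_next - s) < tol
        s = s_next
        if converged:
            break
    return s


def mean_laplacian(adjs):
    """Map the barycenter covariance back to a Laplacian-like matrix."""
    n = len(adjs[0])
    s = gaussian_barycenter([signal_covariance(a) for a in adjs])
    return np.linalg.inv(s) - np.ones((n, n)) / n


# Example: a path graph and a triangle on the same three nodes.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
triangle = np.ones((3, 3)) - np.eye(3)
mean_L = mean_laplacian([path, triangle])
```

Under these assumptions, `mean_L` is symmetric with zero row sums, but nothing in the sketch forces its off-diagonal entries to be non-positive, so it is not guaranteed to be the Laplacian of a graph with non-negative edge weights. Recovering a valid graph would need an additional step, which is not attempted here.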