Finding the mean of sampled data is a fundamental task in machine learning and statistics. When the data samples are graphs, however, even defining a mean is inherently difficult. We propose a novel framework that defines a graph mean via embeddings in the space of smooth graph signal distributions, where graph similarity can be measured using the Wasserstein metric. By finding a mean in this embedding space, we can recover a mean graph that preserves structural information. We establish the existence and uniqueness of this graph mean and provide an iterative algorithm for computing it. To highlight the potential of our framework as a practical tool for machine learning, we evaluate it on several tasks, including k-means clustering of structured graphs, classification of functional brain networks, and semi-supervised node classification in multi-layer graphs. Our experimental results demonstrate that our approach achieves consistent performance, outperforms existing baseline approaches, and improves on state-of-the-art methods.
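To make the idea of comparing graphs through smooth signal distributions concrete, the following is a minimal sketch rather than the method evaluated above. It assumes, as one common model of smooth graph signals, that each graph is represented by a zero-mean Gaussian whose covariance is the pseudo-inverse of the graph Laplacian, and it uses the closed-form 2-Wasserstein distance between zero-mean Gaussians. The embedding choice and all function names (`laplacian`, `signal_covariance`, `graph_distance`, and so on) are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected weighted graph."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def psd_sqrt(mat):
    """Square root of a symmetric positive semidefinite matrix."""
    # Symmetrize and clip tiny negative eigenvalues caused by round-off.
    vals, vecs = np.linalg.eigh((mat + mat.T) / 2.0)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T


def signal_covariance(adj):
    """Assumed embedding: covariance of smooth signals, pinv of the Laplacian."""
    return np.linalg.pinv(laplacian(adj))


def wasserstein2_sq(cov_a, cov_b):
    """Squared 2-Wasserstein distance between N(0, cov_a) and N(0, cov_b)."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    return float(np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross))


def graph_distance(adj_a, adj_b):
    """Wasserstein distance between two graphs on the same node set."""
    d2 = wasserstein2_sq(signal_covariance(adj_a), signal_covariance(adj_b))
    return float(np.sqrt(max(d2, 0.0)))  # guard against round-off below zero


if __name__ == "__main__":
    # Path graph and complete graph on four nodes (illustrative inputs).
    path = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]])
    complete = np.ones((4, 4)) - np.eye(4)
    print(graph_distance(path, complete))
```

Under this assumed embedding, a mean in the embedding space would correspond to a Wasserstein barycenter of the Gaussians, from which a graph would then have to be recovered. The sketch covers only the pairwise distance and assumes both graphs share the same node set and ordering.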