Graph Neural Networks (GNNs) have achieved state-of-the-art results on many graph analysis tasks such as node classification and link prediction. However, important unsupervised problems on graphs, such as graph clustering, have proved more resistant to advances in GNNs. Graph clustering has the same overall goal as node pooling in GNNs. Does this mean that GNN pooling methods do a good job at clustering graphs? Surprisingly, the answer is no: current GNN pooling methods often fail to recover the cluster structure in cases where simple baselines, such as k-means applied to learned representations, work well. We investigate further by carefully designing a set of experiments that study different signal-to-noise scenarios in both graph structure and attribute data. To address these methods' poor clustering performance, we introduce Deep Modularity Networks (DMoN), an unsupervised pooling method inspired by the modularity measure of clustering quality, and show how it tackles recovery of the challenging clustering structure of real-world graphs. On real-world data, DMoN produces high-quality clusters that correlate strongly with ground-truth labels, achieving state-of-the-art results with over 40% improvement over other pooling methods across different metrics.
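For readers unfamiliar with the modularity measure mentioned in the abstract, the following is a minimal sketch of the standard Newman modularity of a hard partition, Q = (1/2m) Σ_ij [A_ij − d_i d_j / 2m] δ(c_i, c_j), written with NumPy. It is meant only to illustrate the quantity that inspires DMoN and is not the DMoN training objective itself; the function name `modularity` and the toy graph `two_triangles` are hypothetical and chosen purely for illustration.

```python
import numpy as np

def modularity(adjacency, labels):
    """Newman modularity of a hard partition of an undirected graph."""
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()  # 2 * (number of edges) for an unweighted graph
    # One-hot cluster assignment matrix C of shape (n_nodes, n_clusters).
    _, inverse = np.unique(labels, return_inverse=True)
    C = np.eye(inverse.max() + 1)[inverse]
    # Q = Tr(C^T (A - d d^T / 2m) C) / 2m, without forming d d^T explicitly.
    within = np.trace(C.T @ A @ C)                    # edge weight inside clusters
    expected = np.sum((C.T @ degrees) ** 2) / two_m   # null-model expectation
    return (within - expected) / two_m

# Toy graph (illustrative): two triangles, {0,1,2} and {3,4,5}, joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = np.zeros((6, 6))
for i, j in edges:
    two_triangles[i, j] = two_triangles[j, i] = 1.0

q_good = modularity(two_triangles, [0, 0, 0, 1, 1, 1])     # 5/14, about 0.357
q_trivial = modularity(two_triangles, [0, 0, 0, 0, 0, 0])  # 0.0
```

Splitting the toy graph into its two triangles scores 5/14, while putting every node in one cluster scores 0, which is why a higher modularity is read as a better clustering under this measure.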