Graph clustering, the task of partitioning the nodes of a graph into disjoint clusters, is important for many downstream applications. Recently, contrastive learning, known for exploiting supervisory information, has shown encouraging results in deep graph clustering. It learns node representations that are well suited to clustering by pulling positively correlated node pairs together and pushing negatively correlated pairs apart in the representation space. Nevertheless, a significant limitation of existing methods is that they do not thoroughly explore node-wise similarity. For instance, some assume that the node similarity matrix in the representation space is the identity matrix, which ignores the inherent semantic relationships among nodes. Given the fundamental role of instance similarity in clustering, we investigate contrastive graph clustering from the perspective of the node similarity matrix. We argue that an ideal node similarity matrix in the representation space should accurately reflect the inherent semantic relationships among nodes, so that semantic similarities are preserved in the learned representations. To this end, we introduce a new framework, Reliable Node Similarity Matrix Guided Contrastive Graph Clustering (NS4GC), which estimates an approximately ideal node similarity matrix in the representation space to guide representation learning. Our method introduces node-neighbor alignment and semantic-aware sparsification, ensuring that the node similarity matrix is both accurate and efficiently sparse. Comprehensive experiments on eight real-world datasets confirm the efficacy of learning the node similarity matrix and the superior performance of NS4GC.
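To make the notion of a node similarity matrix concrete, the following is a minimal sketch in plain Python. It is not the NS4GC implementation: the function names, the fixed-threshold sparsification rule, and the simple `1 - similarity` alignment term are all illustrative assumptions, chosen only to show how pairwise similarities in the representation space can be computed, pruned, and compared against graph neighbors.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def similarity_matrix(embeddings):
    """Cosine similarity between every pair of node embeddings."""
    unit = [normalize(v) for v in embeddings]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]

def sparsify(sim, threshold):
    """Zero out off-diagonal entries below `threshold` (hypothetical rule)."""
    n = len(sim)
    return [[sim[i][j] if i == j or sim[i][j] >= threshold else 0.0
             for j in range(n)] for i in range(n)]

def alignment_loss(sim, edges):
    """Mean of (1 - similarity) over edges; small when neighbors are aligned."""
    return sum(1.0 - sim[i][j] for i, j in edges) / len(edges)

# Toy graph (made-up data): nodes 0 and 1 are neighbors, as are nodes 2 and 3.
embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-0.6, 0.8]]
edges = [(0, 1), (2, 3)]

sim = similarity_matrix(embeddings)
sparse_sim = sparsify(sim, threshold=0.7)
loss = alignment_loss(sim, edges)
```

In this toy setting, an identity target would treat every off-diagonal pair as equally dissimilar, whereas the thresholded matrix keeps the strong similarities between neighboring nodes and discards the weak ones.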