Graph clustering is an important unsupervised learning technique for partitioning attributed graphs and detecting communities. However, current methods struggle to capture true community structure and intra-cluster relations accurately, to remain computationally efficient, and to identify smaller communities. We address these challenges by integrating graph coarsening with modularity maximization, leveraging both the adjacency structure and node features to improve clustering accuracy. We propose a loss function that combines log-determinant, smoothness, and modularity terms, and we optimize it with a block majorization-minimization technique. The method is theoretically consistent under the Degree-Corrected Stochastic Block Model (DC-SBM), which ensures asymptotically error-free performance and complete label recovery. The resulting algorithm is provably convergent and time-efficient, and it integrates with graph neural networks (GNNs) and variational graph autoencoders (VGAEs) to learn enhanced node features. Extensive experiments on benchmark datasets show that it outperforms existing state-of-the-art methods on both attributed and non-attributed graphs.
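To make the modularity term concrete, the following minimal sketch computes the standard Newman modularity of a hard partition, Q = (1/2m) Σ_ij (A_ij − d_i d_j / 2m) [c_i = c_j]. It illustrates only the quantity being maximized, not the full loss or the block majorization-minimization updates described above; the function name `modularity` and the toy graph are illustrative assumptions.

```python
import numpy as np


def modularity(adj, labels):
    """Newman modularity of a hard partition of an undirected graph.

    adj    : (n, n) symmetric adjacency matrix
    labels : length-n sequence of cluster ids
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # equals 2 * (number of edges)
    # Modularity matrix B = A - d d^T / (2m)
    b = adj - np.outer(degrees, degrees) / two_m
    # Indicator of node pairs assigned to the same cluster
    same_cluster = labels[:, None] == labels[None, :]
    return float((b * same_cluster).sum() / two_m)


# Toy graph (assumed for illustration): two triangles joined by one edge.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

q_good = modularity(adj, [0, 0, 0, 1, 1, 1])  # one cluster per triangle
q_bad = modularity(adj, [0, 1, 0, 1, 0, 1])   # clusters cut across triangles
```

On this toy graph the partition that follows the two triangles scores higher than the one that cuts across them, which is the behavior a modularity-based objective rewards.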