Most existing graph clustering methods focus on exploiting topological structure and often neglect the ``missing-half'' of the data, namely node feature information, and in particular how these features can improve clustering performance. High-dimensional features compound the problem. Feature selection in graph clustering is especially difficult because it requires discovering clusters and identifying the features relevant to those clusters at the same time. To address this gap, we introduce a novel paradigm called ``one node one model'', which builds an exclusive model for each node and defines the node label as a combination of predictions for node groups. Specifically, the proposed ``Feature Personalized Graph Clustering (FPGC)'' method uses a squeeze-and-excitation block to identify cluster-relevant features for each node and integrates these features into each model to form the final representations. In addition, we develop the concept of feature cross as a data augmentation technique for learning low-order feature interactions. Extensive experiments demonstrate that FPGC outperforms state-of-the-art clustering methods. Moreover, the plug-and-play nature of our method offers a versatile way to enhance GNN-based models from a feature perspective.
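To make the two feature-level ideas concrete, the following is a minimal sketch, not the FPGC implementation. It assumes that the squeeze-and-excitation block acts as a bottleneck gate (linear, ReLU, linear, sigmoid) producing one weight per feature for each node, and that a feature cross is the set of pairwise products of a node's features. The function names, the bottleneck width `r`, and the random untrained weights are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def se_gate(x, W1, W2):
    """SE-style gate (assumed form): squeeze features to a small
    bottleneck, then excite back to one weight in (0, 1) per feature."""
    hidden = [max(0.0, h) for h in matvec(W1, x)]    # ReLU bottleneck
    return [sigmoid(s) for s in matvec(W2, hidden)]  # per-feature weights


def personalize(X, W1, W2):
    """Reweight each node's features with the gate computed from that node.
    The weights W1, W2 are shared, but the gate differs from node to node
    because it depends on the node's own feature vector."""
    out = []
    for x in X:
        g = se_gate(x, W1, W2)
        out.append([gi * xi for gi, xi in zip(g, x)])
    return out


def feature_cross(x):
    """Second-order feature cross (assumed form): x_i * x_j for i < j."""
    return [x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x))]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Toy input: 3 nodes with 4 features each (hypothetical data).
X = [[1.0, 0.0, 2.0, 0.5],
     [0.0, 3.0, 1.0, 0.0],
     [0.5, 0.5, 0.5, 0.5]]
d, r = 4, 2                    # feature dimension, bottleneck width
W1 = random_matrix(r, d, rng)  # untrained weights, for illustration only
W2 = random_matrix(d, r, rng)

Z = personalize(X, W1, W2)                # gated node features
crossed = [feature_cross(x) for x in X]   # augmented interaction features
```

In a trained model the gate weights would be learned jointly with the clustering objective, so that features irrelevant to a node's cluster receive weights near zero; here the weights are random and only the data flow is meaningful.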