Deep multi-view subspace clustering (DMVSC) has recently attracted increasing attention due to its promising performance. However, existing DMVSC methods still have two issues: (1) they mainly rely on autoencoders to nonlinearly embed the data, but the resulting embedding may be suboptimal for clustering because autoencoders rarely take the clustering objective into account, and (2) they typically have quadratic or even cubic complexity, which makes large-scale data challenging to handle. To address these issues, we propose a novel deep multi-view subspace clustering method with anchor graph (DMCAG). Specifically, DMCAG first learns embedded features for each view independently and uses them to obtain the subspace representations. To significantly reduce the complexity, we construct a small anchor graph for each view. Spectral clustering is then performed on an integrated anchor graph to obtain pseudo-labels. To overcome the negative impact of suboptimal embedded features, we use the pseudo-labels to refine the embedding process so that it is better suited to the clustering task. Pseudo-labels and embedded features are updated alternately. Furthermore, we design a contrastive-learning-based strategy that keeps the labels consistent, which further enhances clustering performance. Empirical studies on real-world datasets show that our method achieves superior clustering performance over other state-of-the-art methods.
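To make the anchor-graph step concrete, the following is a minimal NumPy sketch of spectral clustering on an integrated anchor graph. It is an illustration under stated assumptions, not the DMCAG implementation: it works on raw features rather than autoencoder embeddings, it builds each view's anchor graph with a Gaussian kernel over k-means anchors instead of a learned subspace representation, it fuses views by simple concatenation, and it omits the pseudo-label refinement and contrastive consistency steps. All function names and parameters (`anchor_graph`, `anchor_spectral_clustering`, `m`, `s`) are hypothetical.

```python
import numpy as np


def sq_dists(X, C):
    """Squared Euclidean distances between rows of X (n x d) and rows of C (m x d)."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)


def farthest_point_init(X, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen centers."""
    chosen = [0]
    dist = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[chosen].astype(float)


def kmeans(X, k, n_iter=30):
    """Plain Lloyd iterations; an empty cluster keeps its previous center."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return sq_dists(X, centers).argmin(axis=1), centers


def anchor_graph(X, m, s=3):
    """Row-stochastic n x m graph linking each sample to its s nearest anchors.

    Assumption: a Gaussian-kernel graph stands in for the learned
    subspace representation described in the abstract.
    """
    _, anchors = kmeans(X, m)
    d2 = sq_dists(X, anchors)
    nearest = np.argsort(d2, axis=1)[:, :s]
    d2_near = np.take_along_axis(d2, nearest, axis=1)
    sigma2 = d2_near[:, 0].mean() + 1e-12  # global bandwidth
    # Subtract the row minimum so the nearest anchor always has weight 1.
    w = np.exp(-(d2_near - d2_near[:, :1]) / (2.0 * sigma2))
    Z = np.zeros_like(d2)
    np.put_along_axis(Z, nearest, w, axis=1)
    return Z / Z.sum(axis=1, keepdims=True)


def anchor_spectral_clustering(views, k, m=8, s=3):
    """Cluster n samples from the fused n x (V*m) anchor graph.

    Only a thin SVD of an n x (V*m) matrix is needed, so the cost grows
    linearly with n instead of requiring an n x n affinity matrix.
    """
    # Assumption: views are fused by concatenating their anchor graphs.
    Z = np.hstack([anchor_graph(X, m, s) for X in views]) / len(views)
    col_mass = np.maximum(Z.sum(axis=0), 1e-12)
    Zn = Z / np.sqrt(col_mass)  # normalized bipartite graph
    U, _, _ = np.linalg.svd(Zn, full_matrices=False)
    E = U[:, :k]  # top-k left singular vectors
    E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    labels, _ = kmeans(E, k)
    return labels
```

Calling `anchor_spectral_clustering([X1, X2], k=3)` with two feature matrices that share the same row order returns one pseudo-label per sample. In the pipeline the abstract describes, such pseudo-labels would then be fed back to refine the embeddings, and the two steps would alternate.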