Multiview clustering (MVC) aims to reveal the underlying structure of multiview data by grouping data samples into clusters. Deep learning-based methods exhibit strong feature learning capabilities on large-scale datasets. However, for most existing deep MVC methods, learning representations that are invariant across multiple views remains a challenging problem. In this paper, we propose a cross-view contrastive learning (CVCL) method that learns view-invariant representations and produces clustering results by contrasting the cluster assignments among multiple views. Specifically, we first employ deep autoencoders to extract view-dependent features in the pretraining stage. Then, in the fine-tuning stage, we present a cluster-level CVCL strategy that explores consistent semantic label information among the multiple views. By virtue of this learning strategy, the proposed CVCL method is able to produce more discriminative cluster assignments. Moreover, we provide a theoretical analysis of soft cluster assignment alignment. Extensive experimental results obtained on several datasets demonstrate that the proposed CVCL method outperforms several state-of-the-art approaches.
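To make the idea of contrasting cluster assignments across views concrete, the following is a minimal NumPy sketch and not the loss defined in the paper. It assumes two views whose soft assignment matrices have shape (N, K), treats each column as the representation of one cluster, and applies an InfoNCE-style objective in which the same cluster index in the other view is the positive and all other clusters are negatives. The function name `cluster_contrastive_loss`, the cosine similarity, and the `temperature` value are illustrative assumptions.

```python
import numpy as np


def cluster_contrastive_loss(q_a, q_b, temperature=0.5):
    """Illustrative cluster-level contrastive loss between two views.

    q_a, q_b: (N, K) soft cluster assignments (each row sums to 1).
    Assumption: this InfoNCE-style form is a sketch, not the paper's loss.
    """
    k = q_a.shape[1]
    # Each cluster is described by its column, i.e. a length-N vector.
    clusters = np.concatenate([q_a.T, q_b.T], axis=0)  # (2K, N)
    clusters = clusters / (
        np.linalg.norm(clusters, axis=1, keepdims=True) + 1e-12
    )
    sim = clusters @ clusters.T / temperature  # cosine similarity, (2K, 2K)
    np.fill_diagonal(sim, -np.inf)  # a cluster is never its own negative

    # The positive for cluster j of one view is cluster j of the other view.
    rows = np.arange(2 * k)
    pos = (rows + k) % (2 * k)

    # Numerically stable log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return float(-(sim[rows, pos] - log_denom).mean())


# Toy data: 12 samples, 3 clusters, fairly confident assignments.
n, k = 12, 3
labels = np.arange(n) % k
q_view_a = np.full((n, k), 0.05)
q_view_a[np.arange(n), labels] = 0.90

# Second view agrees with the first (aligned) or has shifted cluster indices.
aligned = cluster_contrastive_loss(q_view_a, q_view_a.copy())
misaligned = cluster_contrastive_loss(q_view_a, np.roll(q_view_a, 1, axis=1))
```

In this toy setup the loss is smaller when both views assign samples to the same cluster indices than when the cluster indices of the second view are shifted, which is the behavior a cross-view alignment objective is meant to reward.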