Concept Factorization (CF), a recent paradigm in representation learning, has demonstrated strong performance in multi-view clustering tasks. It removes restrictions of traditional matrix factorization methods, such as the non-negativity constraint, and uses kernel methods to learn latent representations that capture the underlying structure of the data. However, existing multi-view concept factorization methods do not exploit the limited label information available in real-world multi-view data, which often leads to a significant loss in performance. To address this limitation, we propose a novel semi-supervised multi-view concept factorization model, named SMVCF. In the SMVCF model, we first extend conventional single-view CF to a multi-view version, enabling more effective exploration of the complementary information across views. We then integrate multi-view CF, label propagation, and manifold learning into a unified framework that leverages the valuable information present in the data. In addition, we introduce an adaptive weight vector to balance the importance of the different views in the clustering process. We further develop optimization methods tailored to the SMVCF model. Finally, we conduct extensive experiments on four diverse datasets with varying label ratios to evaluate the performance of SMVCF. The experimental results demonstrate the effectiveness and superiority of the proposed approach in multi-view clustering tasks.
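For orientation, the single-view concept factorization that the abstract builds on is commonly written as below. This is a sketch of the standard formulation from the CF literature, not the SMVCF objective, which the abstract does not state. The symbols X, W, V, K, d, n, and c are notation assumed here for illustration.

```latex
% Assumed notation (not from the abstract):
%   X \in R^{d x n} : data matrix, one sample per column
%   W, V \in R^{n x c} : non-negative factor matrices, c = number of concepts
\min_{W \ge 0,\; V \ge 0} \; \lVert X - X W V^{\top} \rVert_F^2
  = \operatorname{tr}\!\big( (I - W V^{\top})^{\top} K \, (I - W V^{\top}) \big),
\qquad K = X^{\top} X .
```

Because the objective depends on X only through the Gram matrix K, K can be replaced by a kernel matrix, and X itself need not be non-negative. This is the sense in which CF relaxes the non-negativity restriction on the data and admits kernel methods.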