Exploiting the complementary information in multi-view data to improve clustering performance is a central problem in multi-view clustering. In this paper, we propose a novel information-theoretic model, termed Informative Multi-View Clustering (IMVC), which extracts the common and view-specific information hidden in multi-view data and constructs a clustering-oriented comprehensive representation. More specifically, we concatenate the features of all views into a unified feature representation and pass it through an encoder to obtain the common representation across views. Simultaneously, the features of each view are fed to an encoder to produce a compact view-specific representation for that view. We then constrain the mutual information between the common representation and the view-specific representations to be minimal, in order to obtain multi-level information. Further, the common representation and each view-specific representation are concatenated to form the refined representation of that view, which is fed into a decoder to reconstruct the original data by maximizing the mutual information between them. To form a comprehensive representation, the common representation and all view-specific representations are concatenated. Furthermore, to better adapt the comprehensive representation to the clustering task, we maximize the mutual information between each instance and its k-nearest neighbors to enhance intra-cluster aggregation, which in turn induces good separation between clusters overall. Finally, we conduct extensive experiments on six benchmark datasets, and the results indicate that the proposed IMVC outperforms the compared methods.
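The data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the encoders are hypothetical stand-ins (fixed random linear maps with a tanh), the dimensions, the names `linear_encoder`, `build_representations`, and `knn_indices`, and the toy data are all assumptions, and the mutual-information objectives and the decoder are omitted entirely. It only shows how the common, view-specific, refined, and comprehensive representations are assembled and how k-nearest neighbors could be looked up on the comprehensive representation.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear_encoder(in_dim, out_dim):
    """Hypothetical stand-in for a trained encoder: random linear map + tanh."""
    W = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: np.tanh(x @ W)


def build_representations(views, common_dim=8, specific_dim=4):
    # Common branch: concatenate the features of all views, then encode once.
    unified = np.concatenate(views, axis=1)
    common = linear_encoder(unified.shape[1], common_dim)(unified)

    # View-specific branch: one encoder per view.
    specifics = [linear_encoder(v.shape[1], specific_dim)(v) for v in views]

    # Refined representation of each view: common + that view's specific part.
    # In the described model this would be the decoder input.
    refined = [np.concatenate([common, s], axis=1) for s in specifics]

    # Comprehensive representation: common + all view-specific parts.
    comprehensive = np.concatenate([common] + specifics, axis=1)
    return common, specifics, refined, comprehensive


def knn_indices(z, k):
    """Indices of the k nearest neighbors of each row (Euclidean, self excluded)."""
    d = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)  # an instance is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]


# Toy data (assumption): 20 instances observed in two views of 10 and 6 features.
views = [rng.standard_normal((20, 10)), rng.standard_normal((20, 6))]
common, specifics, refined, comprehensive = build_representations(views)
neighbors = knn_indices(comprehensive, k=3)
```

In a full model, the instance and neighbor pairs in `neighbors` would feed the neighbor-based mutual-information term, and the pairs of `common` with each entry of `specifics` would feed the minimization term; how those terms are estimated is not specified in this abstract.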