Multi-view clustering (MVC) has made significant progress, with many efforts dedicated to learning knowledge from multiple views. However, most existing methods are either not applicable to incomplete MVC or require additional steps to handle it. This limitation results in poor clustering performance and poor adaptation to missing views. In addition, noise and outliers can significantly degrade overall clustering performance, and most existing methods do not handle them well. In this paper, we propose a novel unified framework for both incomplete and complete MVC, named self-learning symmetric multi-view probabilistic clustering (SLS-MPC). SLS-MPC introduces a novel symmetric multi-view probability estimation that equivalently transforms the multi-view pairwise posterior matching probability into a composition of each view's individual distribution, which tolerates missing data and extends to any number of views. SLS-MPC then introduces a novel self-learning probability function, which requires neither prior knowledge nor hyper-parameters, to learn each view's individual distribution. Next, graph-context-aware refinement with path propagation and co-neighbor propagation is used to refine the pairwise probabilities, which alleviates the impact of noise and outliers. Finally, SLS-MPC uses a probabilistic clustering algorithm that adjusts clustering assignments by iteratively maximizing the joint probability, without requiring category information. Extensive experiments on multiple benchmarks show that SLS-MPC outperforms previous state-of-the-art methods.
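To make the idea of composing per-view distributions concrete, the following is a minimal sketch and not the formulation used by SLS-MPC, which the abstract does not spell out. It assumes that views are conditionally independent given whether two samples match, so that per-view posterior odds can be multiplied and a missing view simply contributes nothing. The function name `fuse_pairwise_probabilities`, the `prior` argument, and the use of `None` to mark a missing view are all hypothetical choices made for illustration.

```python
import math


def fuse_pairwise_probabilities(view_probs, prior=0.5):
    """Fuse per-view match probabilities for a single pair of samples.

    Illustrative assumption (not taken from the paper): views are
    conditionally independent given the match/non-match event, so the
    fused posterior odds are the prior odds times the product of each
    observed view's odds ratio relative to the prior.

    view_probs: list with one entry per view; each entry is either a
                probability in [0, 1] or None when the view is missing
                for at least one of the two samples.
    prior:      assumed prior probability that a random pair matches.
    """
    eps = 1e-12  # keeps the logarithms finite for probabilities of 0 or 1
    prior_log_odds = math.log(prior) - math.log(1.0 - prior)

    log_odds = prior_log_odds
    for p in view_probs:
        if p is None:
            continue  # a missing view contributes no evidence
        p = min(max(p, eps), 1.0 - eps)
        view_log_odds = math.log(p) - math.log(1.0 - p)
        log_odds += view_log_odds - prior_log_odds

    # Convert the fused log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Two agreeing views, with the middle view missing.
    print(fuse_pairwise_probabilities([0.8, None, 0.8]))
    # Two conflicting views cancel out under a 0.5 prior.
    print(fuse_pairwise_probabilities([0.8, 0.2]))
```

Under this assumption the rule treats matching and non-matching symmetrically: replacing every observed probability `p` with `1 - p` (at a prior of 0.5) yields one minus the fused result. The same function works for any number of views, and a pair with no observed views falls back to the prior.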