Multi-view clustering (MVC) has made significant progress, with many efforts dedicated to learning knowledge from multiple views. However, most existing methods are either not applicable to incomplete MVC or require additional steps to handle it. This limitation results in poor clustering performance and poor adaptation to missing views. Moreover, noise and outliers, which most existing methods do not handle well, can significantly degrade overall clustering performance. In this paper, we propose a novel unified framework for both incomplete and complete MVC, named self-learning symmetric multi-view probabilistic clustering (SLS-MPC). SLS-MPC introduces a novel symmetric multi-view probability estimation that equivalently transforms the multi-view pairwise posterior matching probability into a composition of each view's individual distribution, which tolerates missing data and can extend to any number of views. Then, SLS-MPC introduces a novel self-learning probability function that requires no prior knowledge or hyper-parameters to learn each view's individual distribution. Next, graph-context-aware refinement with path propagation and co-neighbor propagation refines the pairwise probability, alleviating the impact of noise and outliers. Finally, SLS-MPC uses a probabilistic clustering algorithm that adjusts clustering assignments by iteratively maximizing the joint probability, without requiring category information. Extensive experiments on multiple benchmarks show that SLS-MPC outperforms previous state-of-the-art methods.
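To make the idea of composing a pairwise matching probability from per-view estimates more concrete, the following is a minimal sketch and not the SLS-MPC formulation itself, which the abstract does not spell out. It assumes each available view supplies its own probability that two samples belong to the same cluster, that views are conditionally independent, and that all views share a common prior; the function name `fuse_pairwise_probability` and its parameters are hypothetical. A missing view is represented as `None` and simply skipped, which illustrates how such a composition can tolerate missing data and any number of views.

```python
import math
from typing import Optional, Sequence


def fuse_pairwise_probability(
    view_probs: Sequence[Optional[float]],
    prior: float = 0.5,
    eps: float = 1e-12,
) -> float:
    """Fuse per-view match probabilities for one sample pair.

    Assumption (not taken from the paper): views are conditionally
    independent, so each available view adds its log-odds, measured
    relative to the shared prior. Missing views (None) add nothing.
    """
    prior_logit = math.log(prior / (1.0 - prior))
    log_odds = prior_logit
    for p in view_probs:
        if p is None:
            # This view is missing for at least one of the two samples.
            continue
        # Clamp to avoid log(0) when a view is fully confident.
        p = min(max(p, eps), 1.0 - eps)
        log_odds += math.log(p / (1.0 - p)) - prior_logit
    # Convert the fused log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Two observed views agree on a match; the second view is missing.
    print(fuse_pairwise_probability([0.9, None, 0.8]))
```

With a prior of 0.5 the fused value reduces to the product of the per-view probabilities divided by that product plus the product of their complements, so the result is the same regardless of the order in which the views are listed. A pair with no observed views falls back to the prior.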