Multi-view clustering (MVC) aims to integrate complementary information from multiple views to enhance clustering performance. Late Fusion Multi-View Clustering (LFMVC) has shown promise by synthesizing diverse clustering results into a unified consensus. However, current LFMVC methods struggle with noisy and redundant partitions and often fail to capture high-order correlations across views. To address these limitations, we present a novel theoretical framework for analyzing the generalization error bounds of multiple kernel $k$-means, leveraging local Rademacher complexity and principal eigenvalue proportions. Our analysis establishes a convergence rate of $\mathcal{O}(1/n)$, significantly improving upon the existing rate of order $\mathcal{O}(\sqrt{k/n})$. Building on this insight, we propose a low-pass graph filtering strategy within a multiple linear $k$-means framework to mitigate noise and redundancy, further refining the principal eigenvalue proportion and enhancing clustering accuracy. Experimental results on benchmark datasets confirm that our approach outperforms state-of-the-art methods in clustering performance and robustness. The code is available at https://github.com/csliangdu/GMLKM.
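As a rough illustration of what low-pass graph filtering does to a data or partition matrix, the following minimal sketch builds a k-nearest-neighbour graph, forms the symmetric normalized Laplacian $L$, and applies the filter $(I - L/2)^m$. This is a generic textbook filter chosen for illustration, not necessarily the one used in the proposed method; the function names (`knn_affinity`, `normalized_laplacian`, `low_pass_filter`, `smoothness`), the Gaussian edge weights, and the parameters `k` and `order` are all assumptions introduced here.

```python
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric k-nearest-neighbour affinity with Gaussian weights (illustrative choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)                  # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]           # k closest neighbours per row
    rows = np.repeat(np.arange(X.shape[0]), k)
    cols = idx.ravel()
    sigma2 = np.mean(d2[rows, cols]) + 1e-12      # bandwidth from neighbour distances
    W = np.zeros_like(d2)
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)                     # symmetrize

def normalized_laplacian(W):
    """L = I - D^{-1/2} W D^{-1/2}; its eigenvalues lie in [0, 2]."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.eye(W.shape[0]) - S

def low_pass_filter(X, L, order=2):
    """Apply (I - L/2)^order to X, damping high-frequency graph components."""
    H = np.eye(L.shape[0]) - 0.5 * L
    out = X
    for _ in range(order):
        out = H @ out
    return out

def smoothness(X, L):
    """Graph Dirichlet energy trace(X^T L X); lower means smoother on the graph."""
    return float(np.trace(X.T @ L @ X))

# Toy data: two noisy Gaussian blobs standing in for one view.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 4)), rng.normal(4.0, 1.0, (30, 4))])
L = normalized_laplacian(knn_affinity(X, k=5))
X_filtered = low_pass_filter(X, L, order=3)
```

Because the filter response $(1 - \lambda/2)^m$ lies in $[0, 1]$ for every Laplacian eigenvalue $\lambda \in [0, 2]$, the filtered matrix is never less smooth on the graph than the input. In a late-fusion setting, the filtered representations would then be passed to a $k$-means step.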