In multi-view clustering, view quality can vary substantially, and low-quality or degraded views can impair overall clustering performance. Existing studies mainly address this issue within the clustering process, through view weighting or noise-robust optimization, and pay limited attention to data-level assessment before clustering. In this paper, we study pre-clustering noisy-view analysis of multi-view data from a clusterability perspective. To this end, we propose the Multi-View Clusterability Score (MVCS), which quantifies the strength of latent cluster-related structure in multi-view data through three complementary components: per-view structural clusterability, joint-space clusterability, and cross-view neighborhood consistency. To the best of our knowledge, this is the first clusterability score designed specifically for multi-view data. We further use it to analyze and detect potentially noisy views before clustering. Extensive experiments on real-world datasets demonstrate that noisy views can significantly degrade clustering performance and that, compared with existing clusterability measures designed for single-view data, MVCS more effectively supports noisy-view analysis and detection.
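To make the idea of cross-view neighborhood consistency concrete, the following is a minimal sketch and not the MVCS definition itself, which the abstract does not specify. It assumes that consistency between two views can be approximated by the mean Jaccard overlap of each sample's k-nearest-neighbor sets in the two views. The function names `knn_indices` and `neighborhood_consistency`, the choice of Euclidean distance, the value of `k`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbors (Euclidean) of each row of X.

    Each sample is excluded from its own neighbor list.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared pairwise distances
    np.fill_diagonal(d2, np.inf)                      # never pick the sample itself
    return np.argsort(d2, axis=1)[:, :k]


def neighborhood_consistency(views, k=10):
    """Hypothetical consistency measure (an assumption, not the paper's formula).

    For every pair of views, average over samples the Jaccard overlap between
    the sample's k-NN set in one view and its k-NN set in the other.
    Returns a symmetric V x V matrix with ones on the diagonal.
    """
    n = views[0].shape[0]
    rows = np.arange(n)[:, None]
    memberships = []
    for X in views:
        M = np.zeros((n, n), dtype=bool)
        M[rows, knn_indices(X, k)] = True  # M[i, j] is True if j is a neighbor of i
        memberships.append(M)

    V = len(views)
    C = np.eye(V)
    for a in range(V):
        for b in range(a + 1, V):
            inter = (memberships[a] & memberships[b]).sum(axis=1)
            union = (memberships[a] | memberships[b]).sum(axis=1)
            C[a, b] = C[b, a] = np.mean(inter / union)
    return C


# Synthetic illustration: two views share the same 5-cluster structure,
# while the third view is pure Gaussian noise.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 30)  # 150 samples, 30 per cluster
view_a = 10.0 * np.eye(5, 5)[labels] + rng.normal(size=(150, 5))
view_b = 10.0 * np.eye(5, 8)[labels] + rng.normal(size=(150, 8))
view_noise = rng.normal(size=(150, 6))

C = neighborhood_consistency([view_a, view_b, view_noise], k=10)

# Per-view score: mean consistency with all other views (diagonal excluded).
scores = (C.sum(axis=1) - 1.0) / (C.shape[0] - 1)
suspect = int(np.argmin(scores))  # index of the least consistent view
print(np.round(C, 3))
print("least consistent view:", suspect)
```

In this sketch, a view whose neighborhoods agree poorly with those of every other view receives a low score and is flagged as a candidate noisy view. A full score along the lines described above would combine such a term with per-view and joint-space clusterability, and how those components are defined and weighted is not something this example attempts to reproduce.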