This paper discusses key issues in robust clustering, with a focus on clustering based on Gaussian mixture models: the formal definition of outliers, the ambiguity between groups of outliers and clusters, the interaction between robust clustering and the estimation of the number of clusters, the essential dependence of clustering (robust or otherwise) on tuning decisions, and the shortcomings of existing measures of cluster stability in the presence of outliers.