The silhouette coefficient is an established internal clustering evaluation measure that produces a score for each data point, assessing the quality of its cluster assignment. To assess the clustering quality of the whole dataset, the per-point scores are either averaged directly into a single value (micro-averaging) or first averaged within each cluster and then averaged across clusters (macro-averaging). As we illustrate in this work with a synthetic example, the micro-averaging strategy is sensitive to both cluster imbalance and outliers (background noise), while macro-averaging is far more robust to both. Furthermore, macro-averaging allows cluster-balanced sampling, which yields a robust computation of the silhouette score. In an experimental study on eight real-world datasets, in which we estimate the ground-truth number of clusters, we show that both coefficients, micro and macro, should be considered.
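The difference between the two averaging strategies can be made concrete with a small example. The following is a minimal sketch, not the authors' implementation: it assumes Euclidean distance, at least two clusters, and the common convention that a point in a singleton cluster scores 0. The function names (`silhouette_samples`, `micro_silhouette`, `macro_silhouette`) and the toy data are illustrative choices rather than anything taken from the paper.

```python
import math
from collections import defaultdict


def silhouette_samples(points, labels):
    """Per-point silhouette s(i) = (b - a) / max(a, b), Euclidean distance.

    Assumes at least two distinct cluster labels.
    """
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def micro_silhouette(scores):
    """Micro-average: plain mean over all points."""
    return sum(scores) / len(scores)


def macro_silhouette(scores, labels):
    """Macro-average: mean within each cluster, then mean over clusters."""
    per_cluster = defaultdict(list)
    for s, lab in zip(scores, labels):
        per_cluster[lab].append(s)
    cluster_means = [sum(v) / len(v) for v in per_cluster.values()]
    return sum(cluster_means) / len(cluster_means)


# Toy imbalanced data (hypothetical): a tight cluster of four points
# and a looser cluster of two points, all on a line.
points = [(0.0,), (0.0,), (0.0,), (0.0,), (4.0,), (6.0,)]
labels = [0, 0, 0, 0, 1, 1]

scores = silhouette_samples(points, labels)
micro = micro_silhouette(scores)
macro = macro_silhouette(scores, labels)
print(f"micro = {micro:.3f}, macro = {macro:.3f}")
```

On this toy data the four points of the tight cluster each score 1, while the two points of the looser cluster score 1/2 and 2/3. The micro-average (about 0.861) is pulled toward the larger cluster, whereas the macro-average (about 0.792) weights both clusters equally, which is the imbalance effect the abstract describes.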