Selecting the number of communities is a fundamental challenge in network clustering. The silhouette score offers an intuitive, model-free criterion that balances within-cluster cohesion against between-cluster separation. Despite its widespread use in clustering analysis, its behavior in network-based community detection remains insufficiently characterized. In this study, we evaluate the silhouette score across unweighted, weighted, and fully connected networks, examining how network size, separation strength, and community size imbalance affect its accuracy. Simulation studies show that the silhouette score accurately identifies the true number of communities when clusters are well separated and balanced, but that it tends to underestimate the number under strong imbalance or weak separation and to overestimate it in sparse networks. Extending the evaluation to a real airline reachability network, we demonstrate that silhouette-based clustering can recover geographically interpretable and market-oriented clusters. These findings provide empirical guidance for applying the silhouette score in network clustering and clarify the conditions under which its use is most reliable.
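For reference, the cohesion and separation trade-off described above corresponds to the standard silhouette definition, sketched below. The notation is an assumption made here for illustration: $d(i,j)$ stands for a generic dissimilarity between nodes $i$ and $j$, $C_i$ for the community containing node $i$, and $n$ for the number of nodes. The study itself may define node dissimilarity on networks differently.

```latex
% Mean dissimilarity of node i to the other members of its own community (cohesion)
a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\, j \neq i} d(i,j)

% Smallest mean dissimilarity of node i to any other community (separation)
b(i) = \min_{C \neq C_i} \frac{1}{|C|} \sum_{j \in C} d(i,j)

% Silhouette of node i, with s(i) = 0 by convention when |C_i| = 1
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1

% Average silhouette of a partition into K communities, and the selected K
\bar{s}(K) = \frac{1}{n} \sum_{i=1}^{n} s(i), \qquad
\hat{K} = \arg\max_{K \ge 2} \bar{s}(K)
```

Under this formulation, values of $s(i)$ near 1 indicate a node that sits much closer to its own community than to any other, and the number of communities is chosen as the $K$ with the largest average silhouette.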