A novel and intuitive nearest-neighbour-based clustering algorithm is introduced, in which a cluster is defined by an equilibrium condition that balances its size and cohesiveness. The formulation of the equilibrium condition makes it possible to quantify how strongly each point aligns with a cluster, and these cluster alignment strengths lead naturally to a model selection criterion that renders the proposed approach fully automatable. The algorithm is simple to implement and computationally efficient, and it produces clustering solutions of extremely high quality in comparison with relevant benchmarks from the literature. R code implementing the approach is available from https://github.com/DavidHofmeyr/NNEC.