Most existing clustering methods operate at a single granularity of information, such as the distance or density of each data point. Working at this finest granularity is usually inefficient and susceptible to noise. We therefore propose a clustering algorithm that combines multi-granularity granular-balls with a minimum spanning tree (MST). We first construct coarse-grained granular-balls and then use the granular-balls and the MST to implement a clustering method based on "large-scale priority", which largely avoids the influence of outliers and accelerates the construction of the MST. Experimental results on several data sets demonstrate the effectiveness of the algorithm. The code is available at https://github.com/xjnine/GBMST.
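The general idea of clustering on coarse-grained balls rather than on individual points can be illustrated with a minimal sketch. It is not the released GBMST implementation: the splitting rule (two far-apart seed points), the `max_size` stopping threshold, and the final step of cutting the `k - 1` longest MST edges are simplifying assumptions made here for illustration, and all function names are hypothetical.

```python
import numpy as np

def build_balls(X, max_size=8):
    """Split the data into coarse 'balls' (index sets) of at most max_size points.
    Assumed rule: pick two far-apart seeds and assign each point to the nearer one."""
    stack, balls = [np.arange(len(X))], []
    while stack:
        idx = stack.pop()
        if len(idx) <= max_size:
            balls.append(idx)
            continue
        pts = X[idx]
        a = pts[np.linalg.norm(pts - pts.mean(axis=0), axis=1).argmax()]
        b = pts[np.linalg.norm(pts - a, axis=1).argmax()]
        near_a = np.linalg.norm(pts - a, axis=1) <= np.linalg.norm(pts - b, axis=1)
        if near_a.all():  # all points coincide, so the ball cannot be split
            balls.append(idx)
            continue
        stack += [idx[near_a], idx[~near_a]]
    return balls

def mst_edges(C):
    """Prim's algorithm on the complete Euclidean graph over ball centers C."""
    n = len(C)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = np.linalg.norm(C - C[0], axis=1)  # cheapest known link to the tree
    parent = np.zeros(n, dtype=int)
    edges = []
    for _ in range(n - 1):
        j = int(np.where(in_tree, np.inf, best).argmin())
        edges.append((float(best[j]), int(parent[j]), j))
        in_tree[j] = True
        d = np.linalg.norm(C - C[j], axis=1)
        closer = (d < best) & ~in_tree
        best[closer] = d[closer]
        parent[closer] = j
    return edges

def cluster(X, k, max_size=8):
    """Cluster X into k groups via an MST over ball centers (assumed cut rule)."""
    balls = build_balls(X, max_size)
    C = np.array([X[idx].mean(axis=0) for idx in balls])
    edges = sorted(mst_edges(C))
    keep = edges[:max(0, len(edges) - (k - 1))]  # drop the k-1 longest edges
    root = list(range(len(balls)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for _, u, v in keep:
        root[find(u)] = find(v)
    labels = np.empty(len(X), dtype=int)
    for b, idx in enumerate(balls):
        labels[idx] = find(b)  # every point inherits its ball's component
    return labels, len(balls)
```

Because the MST is built over ball centers instead of individual points, the quadratic cost of Prim's algorithm applies to the number of balls, which is usually much smaller than the number of points.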