Granular ball computing (GBC), an efficient, robust, and scalable learning method, has become a popular research topic in granular computing. GBC consists of two stages: granular ball generation (GBG) and multi-granularity learning based on the granular ball (GB). However, existing GBG methods depend strongly on $k$-means or $k$-division, which limits their stability and efficiency. In addition, GB-based classifiers construct classification rules from the GB's geometric characteristics alone and ignore the GB's quality. Therefore, this paper first proposes a fast and stable GBG method (GBG++) based on the attention mechanism. Specifically, when splitting each GB, GBG++ only needs to calculate the distances from a data-driven center to the undivided samples, instead of randomly selecting a center and calculating the distances between it and all samples. Moreover, an outlier detection method is introduced to identify local outliers. Consequently, GBG++ can significantly improve effectiveness, robustness, and efficiency while being absolutely stable. Second, considering the influence of the sample size within a GB on the GB's quality, an improved GB-based $k$-nearest neighbors algorithm (GB$k$NN++) is built on GBG++, which can reduce misclassification at the class boundary. Finally, experimental results on $24$ public benchmark datasets indicate that the proposed method outperforms several existing GB-based classifiers and classical machine learning classifiers.
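As a rough illustration of the two ideas summarized above (a center computed from the data rather than chosen at random, and a vote that accounts for how many samples each GB holds), the following Python sketch may help. It is a simplified stand-in under stated assumptions, not the GBG++ or GB$k$NN++ procedure itself: the function names `generate_balls` and `predict`, the majority-class mean as the center, the radius rule, the surface-distance measure, and the size-weighted vote are all choices made here for illustration, and the outlier detection step is omitted.

```python
import numpy as np


def generate_balls(X, y):
    """Greedy, deterministic granular-ball sketch (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    undivided = np.arange(len(X))  # indices of samples not yet assigned to a ball
    balls = []                     # each ball: (center, radius, label, sample_count)
    while undivided.size > 0:
        # Assumption: the center is the mean of the majority class
        # among the undivided samples, so no random choice is involved.
        labels, counts = np.unique(y[undivided], return_counts=True)
        major = labels[np.argmax(counts)]
        same = undivided[y[undivided] == major]
        center = X[same].mean(axis=0)

        # Distances are computed only for the undivided samples.
        dist = np.linalg.norm(X[undivided] - center, axis=1)
        other = dist[y[undivided] != major]
        limit = other.min() if other.size else np.inf

        # Assumption: the ball keeps same-class samples that lie strictly
        # closer to the center than the nearest sample of another class.
        inside = (y[undivided] == major) & (dist < limit)
        if inside.any():
            members = undivided[inside]
            radius = float(dist[inside].max())
        else:
            # Fallback so the loop always progresses: a single-sample ball.
            offsets = np.linalg.norm(X[same] - center, axis=1)
            nearest = same[np.argmin(offsets)]
            members = np.array([nearest])
            center, radius = X[nearest], 0.0

        balls.append((center, radius, major, len(members)))
        undivided = np.setdiff1d(undivided, members)
    return balls


def predict(balls, x, k=1):
    """Vote among the k nearest balls, weighting each vote by ball size."""
    x = np.asarray(x, dtype=float)
    # Assumption: distance to a ball is measured to its surface.
    surface = np.array([np.linalg.norm(x - c) - r for c, r, _, _ in balls])
    nearest = np.argsort(surface, kind="stable")[:k]
    votes = {}
    for i in nearest:
        _, _, label, size = balls[i]
        votes[label] = votes.get(label, 0) + size
    return max(votes, key=votes.get)


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6], [7, 7]]
    y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
    balls = generate_balls(X, y)
    print(len(balls), predict(balls, [0.2, 0.2]), predict(balls, [6.0, 6.0]))
```

Because nothing in the sketch is random, repeated calls on the same data return identical balls. Note that with a size-weighted vote, choosing `k` larger than the number of balls near the query lets large distant balls dominate, so `k` should be kept small relative to the number of balls.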