Class imbalance learning is a crucial research area in machine learning, where the accurate classification of minority classes poses considerable challenges. Imbalance can bias a model: the majority class dominates training, and the minority class is underrepresented. The random vector functional link (RVFL) network is a widely used and effective classification model owing to its good generalization performance and efficiency; however, its performance degrades on imbalanced datasets. To overcome this limitation, we propose a novel graph embedded intuitionistic fuzzy RVFL for class imbalance learning (GE-IFRVFL-CIL) model, which incorporates a weighting mechanism to handle imbalanced datasets. The proposed GE-IFRVFL-CIL model offers several benefits: $(i)$ it leverages graph embedding to preserve the inherent topological structure of the datasets, $(ii)$ it employs intuitionistic fuzzy theory to handle uncertainty and imprecision in the data, and $(iii)$ most importantly, it tackles class imbalance learning. The combination of a weighting scheme, graph embedding, and intuitionistic fuzzy sets leads to the superior performance of the proposed model on KEEL benchmark imbalanced datasets with and without Gaussian noise. Furthermore, we applied the proposed GE-IFRVFL-CIL to the ADNI dataset and achieved promising results, demonstrating the model's effectiveness in real-world applications. The proposed GE-IFRVFL-CIL model offers a promising solution that addresses the class imbalance issue, mitigates the detrimental effect of noise and outliers, and preserves the inherent geometrical structures of the dataset.
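To make the role of a weighting mechanism in an RVFL concrete, the following is a minimal sketch of a class-weighted RVFL for binary labels. It is not the GE-IFRVFL-CIL formulation: the graph embedding and intuitionistic fuzzy components are omitted, and the inverse-frequency weights, the `tanh` activation, the function names, and the hyperparameters (`n_hidden`, `reg`) are all assumptions chosen for illustration. It assumes labels in {0, 1} and solves a weighted ridge regression for the output weights in closed form.

```python
import numpy as np

def class_weights(y):
    """Per-sample inverse-frequency weights (an assumed scheme, not the paper's)."""
    classes, counts = np.unique(y, return_counts=True)
    per_class = dict(zip(classes, len(y) / (len(classes) * counts)))
    return np.array([per_class[label] for label in y])

def fit_weighted_rvfl(X, y, n_hidden=50, reg=1e-2, seed=0):
    """Fit a class-weighted RVFL; returns (A, b, beta)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    A = rng.uniform(-1, 1, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    H = np.tanh(X @ A + b)
    # RVFL concatenates the original features (direct links) with hidden features.
    D = np.hstack([X, H])
    w = class_weights(y)
    t = np.where(y == 1, 1.0, -1.0)  # targets in {-1, +1}
    DW = D * w[:, None]              # row-scaled copy, so D.T @ DW equals D^T W D
    # Weighted ridge solution: beta = (D^T W D + reg * I)^{-1} D^T W t
    beta = np.linalg.solve(D.T @ DW + reg * np.eye(D.shape[1]), DW.T @ t)
    return A, b, beta

def predict(X, model):
    """Predict labels in {0, 1} from the sign of the network output."""
    A, b, beta = model
    D = np.hstack([X, np.tanh(X @ A + b)])
    return np.where(D @ beta >= 0, 1, 0)
```

With these weights, each minority sample contributes more to the squared-error loss than each majority sample, which counteracts the tendency of the unweighted least-squares solution to favor the majority class.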