In an era defined by rapid data evolution, traditional Machine Learning (ML) models often struggle to adapt to dynamic environments. Evolving Machine Learning (EML) has emerged as a pivotal paradigm, enabling continuous learning and real-time adaptation to streaming data. While prior surveys have examined individual components of evolving learning, such as drift detection, a unified analysis of its major challenges is still lacking. This survey provides a comprehensive overview of EML, focusing on four core challenges: data drift, concept drift, catastrophic forgetting, and skewed learning. We systematically review over 100 studies, categorizing state-of-the-art methods across supervised, unsupervised, and semi-supervised learning. The survey further examines evaluation metrics, benchmark datasets, and real-world applications, compares the effectiveness and limitations of current approaches, and proposes a taxonomy to organize them. In addition, we highlight the growing role of adaptive neural architectures, meta-learning, and ensemble strategies in managing the complexities of evolving data. By synthesizing insights from recent literature, this work not only maps the current landscape of EML but also identifies key research gaps and emerging opportunities. Our findings aim to guide researchers and practitioners in developing robust, ethical, and scalable EML systems for real-world deployment.