Data stream mining aims to extract meaningful knowledge from continually evolving data streams and must cope with nonstationary environments, in particular concept drift, which refers to a change in the underlying data distribution over time. Graph structures offer a powerful modelling tool for representing complex systems such as critical infrastructure systems and social networks. Learning from graph streams is therefore necessary to understand the dynamics of graph structures and to support informed decision-making. This work introduces a novel method for graph stream classification that operates under the general setting in which a data-generating process produces graphs whose nodes and edges vary over time. The method uses incremental learning for continual model adaptation, selects representative graphs (prototypes) for each class, and creates graph embeddings. It also incorporates a loss-based concept drift detection mechanism that recalculates the graph prototypes when drift is detected.
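To make the pipeline described above concrete, the following is a minimal Python sketch of its general shape, not the authors' implementation. Everything specific in it is an assumption made for illustration: graphs are plain edge lists, `degree_signature` is a hypothetical stand-in for a real graph descriptor, prototypes are chosen as class medoids, the embedding is the vector of distances to the prototypes, and a nearest-prototype rule replaces the incrementally trained classifier. The drift detector simply compares the mean 0/1 loss in a sliding window against a reference level; the window size and threshold are arbitrary.

```python
from collections import deque


def degree_signature(edges, size=4):
    """Hypothetical descriptor: share of nodes with degree 1, 2, ..., >= size."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    hist = [0.0] * size
    for d in degree.values():
        hist[min(d, size) - 1] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def distance(g1, g2):
    """L1 distance between degree signatures (an assumed graph distance)."""
    s1, s2 = degree_signature(g1), degree_signature(g2)
    return sum(abs(a - b) for a, b in zip(s1, s2))


def select_prototypes(labelled):
    """Per class, pick the medoid: the graph closest to all others in its class."""
    by_class = {}
    for graph, label in labelled:
        by_class.setdefault(label, []).append(graph)
    return {
        label: min(graphs, key=lambda g: sum(distance(g, h) for h in graphs))
        for label, graphs in by_class.items()
    }


def embed(graph, prototypes):
    """Embed a graph as its distances to the prototypes, ordered by class label."""
    return [distance(graph, prototypes[c]) for c in sorted(prototypes)]


def predict(graph, prototypes):
    """Nearest-prototype rule, standing in for an incrementally trained model."""
    labels = sorted(prototypes)
    emb = embed(graph, prototypes)
    return labels[emb.index(min(emb))]


class LossDriftDetector:
    """Flags drift when the windowed mean loss rises above a reference level."""

    def __init__(self, window=10, threshold=0.3):
        self.recent = deque(maxlen=window)
        self.reference = None  # mean loss of the first full window
        self.threshold = threshold

    def update(self, loss):
        self.recent.append(loss)
        if len(self.recent) < self.recent.maxlen:
            return False
        mean = sum(self.recent) / len(self.recent)
        if self.reference is None:
            self.reference = mean
            return False
        if mean - self.reference > self.threshold:
            # Reset so that a new reference is learned after the drift.
            self.recent.clear()
            self.reference = None
            return True
        return False


def run_stream(stream, initial, buffer_size=4):
    """Classify each graph, track the loss, and refresh prototypes on drift."""
    prototypes = select_prototypes(initial)
    buffer = deque(initial, maxlen=buffer_size)  # most recent labelled graphs
    detector = LossDriftDetector()
    drift_points = []
    for t, (graph, label) in enumerate(stream):
        loss = 0.0 if predict(graph, prototypes) == label else 1.0
        buffer.append((graph, label))
        if detector.update(loss):
            # Classes absent from the buffer keep their old prototype.
            prototypes.update(select_prototypes(buffer))
            drift_points.append(t)
    return prototypes, drift_points
```

A stream here is any iterable of `(edge_list, label)` pairs, so graphs of different sizes can be mixed freely. Recomputing prototypes only from a short buffer of recent graphs is one simple way to make sure they reflect the post-drift concept; a practical system would tune the buffer and window sizes together.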