Data stream mining aims at extracting meaningful knowledge from continually evolving data streams, addressing the challenges posed by nonstationary environments, particularly concept drift, which refers to a change in the underlying data distribution over time. Graph structures offer a powerful modelling tool to represent complex systems, such as critical infrastructure systems and social networks. Learning from graph streams becomes a necessity to understand the dynamics of graph structures and to facilitate informed decision-making. This work introduces a novel method for graph stream classification which operates under the general setting where a data-generating process produces graphs with varying nodes and edges over time. The method uses incremental learning for continual model adaptation, selects representative graphs (prototypes) for each class, and creates graph embeddings. Additionally, it incorporates a loss-based concept drift detection mechanism to recalculate graph prototypes when drift is detected.
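The loss-based drift detection idea can be illustrated with a minimal sketch. The abstract does not state the detection statistic, so everything here is an assumption made for illustration: the hypothetical `LossDriftDetector` class, the fixed window size, and the margin rule (flag drift when the mean of recent losses exceeds a reference mean by more than `margin`). The point at which the described method would recalculate its graph prototypes is marked with a comment only.

```python
from collections import deque
from statistics import mean


class LossDriftDetector:
    """Flags drift when the recent mean loss exceeds a reference mean by a margin.

    Hypothetical helper: the window size and margin rule are illustrative
    assumptions, not the statistic used in the paper.
    """

    def __init__(self, window=5, margin=0.3):
        self.window = window
        self.margin = margin
        self.reference = None               # mean loss of the first full window
        self.recent = deque(maxlen=window)  # sliding window of latest losses

    def update(self, loss):
        """Record one loss value; return True if drift is flagged."""
        self.recent.append(loss)
        if len(self.recent) < self.window:
            return False
        current = mean(self.recent)
        if self.reference is None:
            self.reference = current
            return False
        if current - self.reference > self.margin:
            # Reset so a new reference is learned under the new concept.
            self.reference = None
            self.recent.clear()
            return True
        return False


def run_stream(losses, detector):
    """Return the stream positions at which drift was flagged."""
    drift_points = []
    for t, loss in enumerate(losses):
        if detector.update(loss):
            # In the described method, graph prototypes would be
            # recalculated at this point.
            drift_points.append(t)
    return drift_points


# Synthetic per-graph losses: low at first, then a sudden jump (assumed data).
losses = [0.1] * 10 + [0.9] * 10
drift_points = run_stream(losses, LossDriftDetector(window=5, margin=0.3))
print(drift_points)
```

On this synthetic stream the detector flags drift shortly after the jump in loss, then rebuilds its reference from the post-drift losses. A real detector would likely use a statistically grounded threshold rather than a fixed margin.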