Continual Learning (CL) is the process of ceaselessly learning a sequence of tasks. Most existing CL methods deal with independent data (e.g., images and text), for which many benchmark frameworks and results under standard experimental settings are available. However, CL methods for graph data (graph CL) remain surprisingly underexplored because of (a) the lack of standard experimental settings, especially regarding how to deal with the dependency between instances, (b) the lack of benchmark datasets and scenarios, and (c) the high complexity of implementation and evaluation caused by this dependency. In this paper, regarding (a), we define four standard incremental settings (task-, class-, domain-, and time-incremental) for graph data, which apply naturally to many node-, link-, and graph-level problems. Regarding (b), we provide 25 benchmark scenarios based on 15 real-world graphs. Regarding (c), we develop BeGin, an easy and fool-proof framework for graph CL. BeGin is easy to extend because it is built from reusable modules for data processing, algorithm design, and evaluation. In particular, the evaluation module is completely separated from user code to eliminate potential mistakes. Using all of the above, we report extensive benchmark results for 10 graph CL methods. Compared to the latest benchmark for graph CL, BeGin covers 3x more combinations of incremental settings and levels of problems. All assets for the benchmark framework are available at https://github.com/ShinhwanKang/BeGin.
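To make the class-incremental setting and the separated evaluation step more concrete, the following is a minimal sketch in plain Python. It is not BeGin's API: the function names (`split_classes_into_tasks`, `average_performance`, `average_forgetting`), the toy labels, and the accuracy values are all hypothetical and chosen only for illustration. The sketch also ignores edges between nodes, so it does not address the dependency between instances that makes graph CL difficult. The forgetting measure shown is one common definition, assumed here rather than taken from the paper.

```python
from typing import List, Sequence


def split_classes_into_tasks(labels: Sequence[int], classes_per_task: int) -> List[List[int]]:
    """Group node indices into tasks; each task introduces a new set of classes.

    Hypothetical helper: classes are sorted and assigned to tasks in order.
    """
    classes = sorted(set(labels))
    tasks = []
    for start in range(0, len(classes), classes_per_task):
        task_classes = set(classes[start:start + classes_per_task])
        tasks.append([i for i, y in enumerate(labels) if y in task_classes])
    return tasks


def average_performance(perf: List[List[float]]) -> float:
    """Mean accuracy over all tasks after training on the final task.

    perf[i][j] is the accuracy on task j measured after training on task i.
    """
    final_row = perf[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(perf: List[List[float]]) -> float:
    """Mean drop from the accuracy right after learning a task to its final accuracy."""
    n = len(perf)
    if n < 2:
        return 0.0
    return sum(perf[j][j] - perf[-1][j] for j in range(n - 1)) / (n - 1)


# Toy node labels (assumed data): four classes, two nodes per class.
node_labels = [0, 1, 2, 3, 0, 1, 2, 3]
tasks = split_classes_into_tasks(node_labels, classes_per_task=2)

# Toy performance matrix (assumed values, not benchmark results).
perf = [
    [0.9, 0.0],  # after task 0
    [0.6, 0.8],  # after task 1
]
ap = average_performance(perf)
af = average_forgetting(perf)
```

Keeping the metric functions apart from the training loop mirrors, in a very reduced form, the idea of an evaluation module that user code cannot accidentally alter.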