We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are large in scale, span years in duration, incorporate both node- and edge-level prediction tasks, and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use-cases. We extensively benchmark each dataset and find that the performance of common models can vary drastically across datasets. In addition, on dynamic node property prediction tasks, we show that simple methods often achieve superior performance compared to existing temporal graph models. We believe that these findings open up opportunities for future research on temporal graphs. Finally, TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup, and performance evaluation. TGB will be maintained and updated on a regular basis and welcomes community feedback. TGB datasets, data loaders, example code, evaluation setup, and leaderboards are publicly available at https://tgb.complexdatalab.com/.