Foundation Models (FMs) serve as a general paradigm for developing artificial intelligence systems, offering broad potential for generalization across a spectrum of downstream tasks. Despite extensive research into self-supervised learning as the cornerstone of FMs, several outstanding issues persist in Graph Foundation Models that rely on graph self-supervised learning, namely: 1) Homogenization. The extent of generalization capability on downstream tasks remains unclear. 2) Scalability. How effectively these models scale to large datasets is unknown. 3) Efficiency. The training time and memory usage of these models require evaluation. 4) Training Stop Criteria. The optimal strategy for stopping pre-training across multiple tasks, so as to maximize performance on downstream tasks, remains undetermined. To address these questions, we have constructed a rigorous benchmark that thoroughly analyzes and studies the generalization and scalability of self-supervised Graph Neural Network (GNN) models. Regarding generalization, we have implemented and compared the performance of various self-supervised GNN models, trained to generate node representations, across tasks such as node classification, link prediction, and node clustering. Regarding scalability, we have compared the performance of various models trained with full-batch versus mini-batch strategies. Additionally, we have assessed the training efficiency of these models through experiments measuring their GPU memory usage and throughput. Through these experiments, we aim to provide insights that motivate future research. The code for this benchmark is publicly available at https://github.com/NYUSHCS/GraphFM.