With the rapid proliferation of scientific literature, versatile academic knowledge services increasingly rely on comprehensive academic graph mining. Despite the availability of public academic graphs, benchmarks, and datasets, these resources often fall short in multi-aspect and fine-grained annotations, are constrained to specific task types and domains, or lack underlying real academic graphs. In this paper, we present OAG-Bench, a comprehensive, multi-aspect, and fine-grained human-curated benchmark based on the Open Academic Graph (OAG). OAG-Bench covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results to date. We propose new data annotation strategies for certain tasks and offer a suite of data pre-processing code, algorithm implementations, and standardized evaluation protocols to facilitate academic graph mining. Extensive experiments reveal that even advanced algorithms such as large language models (LLMs) struggle with key challenges in certain tasks, such as paper source tracing and scholar profiling. We also introduce the Open Academic Graph Challenge (OAG-Challenge) to encourage community input and sharing. We envisage that OAG-Bench can serve as a common ground for the community to evaluate and compare algorithms in academic graph mining, thereby accelerating algorithm development and advancement in this field. OAG-Bench is accessible at https://www.aminer.cn/data/.