Dynamic Text-Attributed Graphs (DyTAGs), which intricately integrate structural, temporal, and textual attributes, are crucial for modeling complex real-world systems. However, most existing DyTAG datasets exhibit poor textual quality, which severely limits their utility for generative DyTAG tasks requiring semantically rich inputs. Additionally, prior work mainly focuses on discriminative tasks on DyTAGs, resulting in a lack of standardized task formulations and evaluation protocols tailored for DyTAG generation. To address these critical issues, we propose the Generative DyTAG Benchmark (GDGB), which comprises eight meticulously curated DyTAG datasets with high-quality textual features for both nodes and edges, overcoming the limitations of prior datasets. Building on GDGB, we define two novel DyTAG generation tasks: Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph Generation (IDGG). TDGG transductively generates a target DyTAG based on the given source and destination node sets, while the more challenging IDGG introduces new node generation to inductively model the dynamic expansion of real-world graph data. To enable holistic evaluation, we design multifaceted metrics that assess the structural, temporal, and textual quality of the generated DyTAGs. We further propose GAG-General, an LLM-based multi-agent generative framework tailored for reproducible and robust benchmarking of DyTAG generation. Experimental results demonstrate that GDGB enables rigorous evaluation of TDGG and IDGG, with key insights revealing the critical interplay of structural and textual features in DyTAG generation. These findings establish GDGB as a foundational resource for advancing generative DyTAG research and unlocking further practical applications of DyTAG generation. The dataset and source code are available at https://github.com/Lucas-PJ/GDGB-ALGO.