Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation

Graph generation is a fundamental task that has been extensively studied in social, technological, and scientific analysis. For modeling the dynamic graph evolution process, traditional rule-based methods struggle to capture community structures within graphs, while deep learning methods focus only on fitting training graphs. As a result, existing graph generators produce graphs that either adhere to predefined rules or closely resemble the training datasets, and they perform poorly in dynamic graph generation. Given that graphs are abstract representations arising from pairwise interactions in human activities, a realistic simulation of human interactions could provide deeper insights into the graph evolution mechanism. With the increasing recognition of large language models (LLMs) in simulating human behavior, we introduce GraphAgent-Generator (GAG), a novel simulation-based framework for dynamic graph generation. Without any training or fine-tuning of the LLM, our framework effectively replicates seven macro-level structural characteristics from established network science theories while surpassing existing baselines in graph expansion tasks by 31\% on specific evaluation metrics. Through a node classification task, we validate that GAG effectively preserves the characteristics of real-world networks in the node-wise textual features of the generated text-rich graphs. Furthermore, by incorporating parallel acceleration, GAG supports generating graphs with up to nearly 100,000 nodes or 10 million edges through large-scale LLM-based agent simulation, with a minimum speed-up of 90.4\%. The source code is available at https://anonymous.4open.science/r/GraphAgent-2206.
