Large language models (LLMs) have achieved great success in many fields, and recent work has explored LLMs for graph discriminative tasks such as node classification. However, the ability of LLMs to generate graphs remains unexplored in the literature. Graph generation requires the LLM to generate graphs with given properties, which has valuable real-world applications such as drug discovery but tends to be more challenging. In this paper, we propose LLM4GraphGen to explore the ability of LLMs for graph generation through systematic task designs and extensive experiments. Specifically, we propose several tasks, paired with comprehensive experiments, to address key questions regarding LLMs' understanding of different graph structure rules, their ability to capture structural type distributions, and their utilization of domain knowledge for property-based graph generation. Our evaluations demonstrate that LLMs, particularly GPT-4, exhibit preliminary abilities in graph generation tasks, including rule-based and distribution-based generation. We also observe that popular prompting methods, such as few-shot and chain-of-thought prompting, do not consistently enhance performance. In addition, LLMs show potential in generating molecules with specific properties. These findings may serve as foundations for designing good LLM-based models for graph generation and provide valuable insights for further research.