Text-Attributed Graphs (TAGs) are graphs of connected textual documents. Graph models can learn from TAGs efficiently, but their training relies heavily on human-annotated labels, which are scarce or even unavailable in many applications. Large language models (LLMs) have recently demonstrated remarkable few-shot and zero-shot capabilities in TAG learning, but they suffer from scalability, cost, and privacy issues. In this work, we therefore focus on combining the complementary strengths of LLMs and graph models by distilling the power of LLMs into a local graph model for TAG learning. To bridge the inherent gap between LLMs (generative models for text) and graph models (discriminative models for graphs), we propose to first let the LLM teach an interpreter using rich textual rationales, and then let a student model mimic the interpreter's reasoning without access to the LLM's textual rationales. Extensive experiments validate the efficacy of the proposed framework.