Large Language Models (LLMs) are increasingly applied to tasks involving structured inputs such as graphs. Abstract Meaning Representations (AMRs), which encode rich semantics as directed graphs, offer a rigorous testbed for evaluating LLMs on text generation from such structures. Yet current methods often linearize AMRs arbitrarily, discarding key structural cues, or rely on architectures incompatible with standard LLMs. We introduce SAFT, a structure-aware fine-tuning approach that injects graph topology into pretrained LLMs without architectural changes. We compute direction-sensitive positional encodings from the magnetic Laplacian of transformed AMRs and project them into the embedding space of the LLM. While SAFT is in principle applicable to any graph-structured input, we focus on AMR-to-text generation as a representative and challenging benchmark. SAFT sets a new state of the art on AMR 3.0, improving over baselines by 3.5 BLEU. Gains scale with graph complexity, highlighting the value of structure-aware representations for enhancing LLM performance. SAFT offers a general and effective pathway for bridging structured data and language models.
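As a rough illustration (a standard construction, not necessarily the paper's exact formulation): for a directed graph with adjacency matrix $A$ and a potential parameter $q \ge 0$, the magnetic Laplacian is commonly defined via a symmetrized adjacency and a complex phase that records edge direction,
\[
A_s = A \lor A^\top, \qquad
D_s = \operatorname{diag}(A_s \mathbf{1}), \qquad
\Theta^{(q)}_{uv} = 2\pi q \,\bigl(A_{uv} - A_{vu}\bigr),
\]
\[
L^{(q)} = D_s - A_s \odot \exp\!\bigl(i\,\Theta^{(q)}\bigr),
\]
where $\odot$ is the elementwise product. Because $L^{(q)}$ is Hermitian, its eigenvectors are well defined; taking the real and imaginary parts of the eigenvectors associated with the smallest eigenvalues yields per-node, direction-sensitive positional features, which a learned linear map can then project into the LLM's token embedding space.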