Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabling them to learn new words and understand community-specific connotations. However, Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly. Therefore, it is crucial for LLMs to learn novel interpretations in-context. In this paper, we systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning. To facilitate our study, we introduce MAGNIFICo, an evaluation suite implemented within a text-to-SQL semantic parsing framework that incorporates diverse tokens and prompt settings to simulate real-world complexity. Experimental results on MAGNIFICo demonstrate that LLMs exhibit a surprisingly robust capacity for comprehending novel interpretations from natural language descriptions as well as from discussions within long conversations. Nevertheless, our findings also highlight the need for further improvements, particularly when interpreting unfamiliar words or when composing multiple novel interpretations simultaneously in the same example. Additionally, our analysis uncovers the semantic predispositions in LLMs and reveals the impact of recency bias for information presented in long contexts.