Cognitive research indicates that abstraction ability is essential to human intelligence, yet it remains under-explored in language models. In this paper, we present AbsPyramid, a unified entailment graph of 221K textual descriptions of abstraction knowledge. While existing resources cover only nouns or verbs within simplified events or specific domains, AbsPyramid collects abstraction knowledge for three components of diverse events to comprehensively evaluate the abstraction ability of language models in the open domain. Experimental results demonstrate that current LLMs struggle to comprehend abstraction knowledge in zero-shot and few-shot settings. By training on our rich abstraction knowledge, we find that LLMs can acquire basic abstraction abilities and generalize to unseen events. We also show empirically that our benchmark is comprehensive enough to enhance LLMs on two previous abstraction tasks.