Analogical reasoning is a fundamental human cognitive ability. However, current language models (LMs) still struggle to achieve human-like performance on analogical reasoning tasks due to a lack of resources for model training. In this work, we address this gap by proposing ANALOGYKB, a million-scale analogy knowledge base (KB) derived from existing knowledge graphs (KGs). ANALOGYKB identifies two types of analogies from the KGs: 1) analogies of the same relations, which can be directly extracted from the KGs, and 2) analogies of analogous relations, which are identified with a selection and filtering pipeline enabled by large language models (LLMs), followed by minor human effort for data quality control. Evaluations on a series of datasets covering two analogical reasoning tasks (analogy recognition and generation) demonstrate that ANALOGYKB enables both smaller LMs and LLMs to improve their analogical reasoning capabilities.
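As a rough illustration of the first analogy type (analogies of the same relations), the minimal sketch below groups KG triples by relation and pairs up term pairs that share a relation. The function name `same_relation_analogies`, the triple layout `(head, relation, tail)`, and the toy triples are assumptions made for this example; this is not the ANALOGYKB extraction pipeline, and it does not cover the LLM-based selection and filtering used for analogous relations.

```python
from collections import defaultdict
from itertools import combinations


def same_relation_analogies(triples):
    """Pair up (head, tail) term pairs that share the same relation.

    `triples` is an iterable of (head, relation, tail) tuples; this layout
    is an assumption for illustration, not the format used by ANALOGYKB.
    """
    pairs_by_relation = defaultdict(list)
    for head, relation, tail in triples:
        pairs_by_relation[relation].append((head, tail))

    analogies = []
    for relation, pairs in pairs_by_relation.items():
        # Any two term pairs under one relation form a candidate analogy
        # "a is to b as c is to d".
        for (a, b), (c, d) in combinations(pairs, 2):
            analogies.append((a, b, c, d, relation))
    return analogies


# Hypothetical toy triples, not taken from any real knowledge graph dump.
toy_triples = [
    ("Paris", "capital_of", "France"),
    ("Tokyo", "capital_of", "Japan"),
    ("Einstein", "born_in", "Ulm"),
]

analogies = same_relation_analogies(toy_triples)
for a, b, c, d, relation in analogies:
    print(f"{a} : {b} :: {c} : {d}  ({relation})")
```

The number of candidates grows quadratically with the number of term pairs per relation, so a sketch like this would need sampling or filtering before being applied to a full KG.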