The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A". This is the Reversal Curse. For instance, if a model is trained on "Olaf Scholz was the ninth Chancellor of Germany", it will not automatically be able to answer the question, "Who was the ninth Chancellor of Germany?". Moreover, the likelihood of the correct answer ("Olaf Scholz") will not be higher than for a random name. Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e., if "A is B" occurs, "B is A" is more likely to occur). We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as "Uriah Hawthorne is the composer of 'Abyssal Melodies'" and showing that they fail to correctly answer "Who composed 'Abyssal Melodies'?". The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as "Who is Tom Cruise's mother? [A: Mary Lee Pfeiffer]" and the reverse "Who is Mary Lee Pfeiffer's son?". GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. This shows a failure of logical deduction that we hypothesize is caused by the Reversal Curse. Code is available at https://github.com/lukasberglund/reversal_curse.
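The forward/reverse comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code (that lives in the repository linked above). The helper names `make_examples`, `accuracy`, and `toy_forward_only_model` are hypothetical, the second fact in `FACTS` is invented for illustration, and the "model" is a plain dictionary lookup standing in for a finetuned LLM so that the example runs without any external dependency.

```python
from typing import Callable, List, Tuple

# (name, description) pairs. The first comes from the abstract;
# the second is invented here purely for illustration.
FACTS: List[Tuple[str, str]] = [
    ("Uriah Hawthorne", "the composer of 'Abyssal Melodies'"),
    ("Mara Ellison", "the architect of 'The Glass Orchard'"),
]

Example = Tuple[str, str]  # (question, expected answer substring)


def make_examples(facts: List[Tuple[str, str]]) -> Tuple[List[Example], List[Example]]:
    """Build forward (name -> description) and reverse (description -> name) questions."""
    forward = [(f"Who is {name}?", desc) for name, desc in facts]
    reverse = [(f"Who is {desc}?", name) for name, desc in facts]
    return forward, reverse


def accuracy(answer_fn: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of questions whose answer contains the expected string."""
    hits = sum(
        1 for question, expected in examples
        if expected.lower() in answer_fn(question).lower()
    )
    return hits / len(examples)


def toy_forward_only_model(question: str) -> str:
    """Assumption: a stand-in for a model that memorized only 'A is B'.

    It can look up a description when the name appears in the question,
    but it has no mapping from a description back to a name.
    """
    memory = dict(FACTS)  # name -> description only
    for name, desc in memory.items():
        if name in question:
            return desc
    return "I don't know"


forward_examples, reverse_examples = make_examples(FACTS)
forward_acc = accuracy(toy_forward_only_model, forward_examples)
reverse_acc = accuracy(toy_forward_only_model, reverse_examples)
print(f"forward accuracy: {forward_acc:.2f}, reverse accuracy: {reverse_acc:.2f}")
```

To evaluate a real model, `toy_forward_only_model` would be replaced by a function that sends the question to that model and returns its text answer. Substring matching is a deliberately crude scoring rule chosen for brevity.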

