Large language models (LLMs) have a surprising failure: when trained on "A has a feature B", they do not generalize to "B is a feature of A", a phenomenon termed the Reversal Curse. Because of Zipf's law, this issue appears even when training on trillions of tokens, and hence would persist even if we trained on the entire internet. This work proposes an alternative training scheme, called reverse training, in which all words are used twice, doubling the number of available tokens. The LLM is trained in both the forward and reverse directions by reversing the training strings while preserving (i.e., not reversing) chosen substrings, such as entities. We show that data-matched reverse-trained models outperform standard models on standard tasks, and that compute-matched reverse-trained models perform far better on reversal tasks, helping to resolve the Reversal Curse.
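To make the reversal step concrete, the following is a minimal sketch of reversing a training string while keeping chosen substrings intact. It is an illustration under stated assumptions, not the authors' implementation: the whitespace tokenization, the hand-supplied entity list, and the function names `reverse_preserving` and `make_training_pair` are all hypothetical choices, since the abstract does not say how entities are detected or at what granularity the strings are reversed.

```python
def reverse_preserving(text, entities=()):
    """Reverse the word order of `text`, keeping each entity's own word order.

    Assumptions (not from the source): words are split on whitespace and
    `entities` is a hand-supplied list of multi-word spans to keep intact.
    """
    words = text.split()
    # Try longer entities first so that overlapping candidates resolve greedily.
    entity_words = sorted((e.split() for e in entities), key=len, reverse=True)

    chunks = []
    i = 0
    while i < len(words):
        for ent in entity_words:
            if ent and words[i:i + len(ent)] == ent:
                chunks.append(" ".join(ent))  # keep the entity as one unit
                i += len(ent)
                break
        else:
            chunks.append(words[i])  # ordinary word, reversed individually
            i += 1
    return " ".join(reversed(chunks))


def make_training_pair(text, entities=()):
    """Return the forward string and its entity-preserving reversal."""
    return [text, reverse_preserving(text, entities)]


if __name__ == "__main__":
    sentence = "Alice Smith wrote the novel Blue Harbor"
    for sample in make_training_pair(sentence, ["Alice Smith", "Blue Harbor"]):
        print(sample)
```

Each original string yields two training samples, one per direction, which is how every word ends up being used twice. In the reversed sample the entity "Blue Harbor" precedes "Alice Smith" while both names remain readable, so the model sees the fact in the opposite order.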