For large language model (LLM)-based assistants to adapt effectively to evolving information needs, it must be possible to update their factual knowledge through continued training on new data. The standard recipe involves continued pre-training on new documents followed by instruction-tuning on question-answer (QA) pairs. However, we find that LLMs trained with this recipe struggle to answer questions, even though the perplexity of the documents is minimized. We observe that QA pairs are generally straightforward, whereas documents are more complex, weaving many factual statements together in an intricate manner. We therefore hypothesize that it is beneficial to expose LLMs to QA pairs before continued pre-training on documents, so that the process of encoding knowledge from complex documents takes into account how that knowledge is accessed through questions. Based on this hypothesis, we propose pre-instruction-tuning (PIT), a method that instruction-tunes on questions before training on documents. This contrasts with standard instruction-tuning, which learns how to extract knowledge after training on documents. Extensive experiments and ablation studies demonstrate that PIT significantly enhances the ability of LLMs to absorb knowledge from new documents, outperforming standard instruction-tuning by 17.8%.
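The difference between the two recipes comes down to the order of the training stages, which can be sketched as below. This is only an illustration of the ordering stated in the abstract: the stage labels and the helper names (`standard_schedule`, `pit_schedule`, `first_stage_using`) are hypothetical, and details such as which QA pairs are used or whether further stages follow are not specified here.

```python
from typing import List, Tuple

# A stage is (training objective, data source). Both labels are
# illustrative names, not identifiers from any real library.
Stage = Tuple[str, str]


def standard_schedule() -> List[Stage]:
    """Standard recipe: train on documents first, then on QA pairs."""
    return [
        ("continued_pretraining", "documents"),
        ("instruction_tuning", "qa_pairs"),
    ]


def pit_schedule() -> List[Stage]:
    """Pre-instruction-tuning (PIT): QA pairs come before documents."""
    return [
        ("instruction_tuning", "qa_pairs"),
        ("continued_pretraining", "documents"),
    ]


def first_stage_using(schedule: List[Stage], data_source: str) -> int:
    """Return the index of the first stage that trains on data_source."""
    for index, (_, source) in enumerate(schedule):
        if source == data_source:
            return index
    raise ValueError(f"no stage uses {data_source!r}")


if __name__ == "__main__":
    for name, schedule in [("standard", standard_schedule()),
                           ("PIT", pit_schedule())]:
        order = " -> ".join(f"{obj} on {src}" for obj, src in schedule)
        print(f"{name}: {order}")
```

In a real pipeline each stage would be a training run over the corresponding dataset; the sketch only makes explicit that PIT swaps which data the model sees first.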