Fine-tuning a language model on a new domain is standard practice for domain adaptation. However, it can be infeasible for modern large-scale language models such as GPT-3, which can be accessed only through APIs, so their internal parameters are difficult to access. In this paper, we propose $k$NN-Adapter, a method to effectively adapt these black-box large language models (LLMs) to a new domain. $k$NN-Adapter builds on retrieval-augmented language models and adaptively learns to interpolate the output of the language model with retrieval results from a datastore built from target-domain data. Our experiments on four different domains demonstrate that $k$NN-Adapter significantly improves perplexity and works particularly well in settings with limited access to LLMs. Additionally, we show that $k$NN-Adapter is more effective than fine-tuning when the amount of training data is limited. We also release a dataset to encourage further study.
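The interpolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual retrieval-augmented formulation, in which the final next-token distribution is a convex combination of the language model's output distribution and a distribution built from retrieved neighbours. The abstract does not give the exact formula, so the function names (`knn_distribution`, `interpolate`, `perplexity`), the softmax over negative distances, the `temperature` parameter, and the fixed coefficient `lam` are all illustrative assumptions. In particular, $k$NN-Adapter is described as learning the interpolation adaptively, which this sketch does not attempt to reproduce.

```python
import math


def knn_distribution(distances, next_tokens, vocab_size, temperature=1.0):
    """Turn retrieved neighbours into a next-token distribution.

    Assumption: each neighbour votes for the token that followed it in the
    datastore, weighted by softmax(-distance / temperature).
    """
    if not distances or len(distances) != len(next_tokens):
        raise ValueError("need one next-token id per retrieved neighbour")
    scores = [-d / temperature for d in distances]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for weight, token in zip(weights, next_tokens):
        probs[token] += weight / total
    return probs


def interpolate(p_lm, p_knn, lam):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm.

    Only the LM's output probabilities are needed, which is what makes the
    approach usable with a black-box model. `lam` is a fixed hyperparameter
    here (an assumption of this sketch, not the learned coefficient).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(p_lm) != len(p_knn):
        raise ValueError("distributions must share a vocabulary")
    return [lam * k + (1.0 - lam) * m for m, k in zip(p_lm, p_knn)]


def perplexity(gold_token_probs):
    """Perplexity from the probabilities assigned to the gold tokens."""
    n = len(gold_token_probs)
    return math.exp(-sum(math.log(p) for p in gold_token_probs) / n)


# Toy example with a 3-token vocabulary (hypothetical numbers).
p_lm = [0.7, 0.2, 0.1]                        # black-box LM output
p_knn = knn_distribution([0.0, 0.0], [1, 1], vocab_size=3)
mixed = interpolate(p_lm, p_knn, lam=0.5)
```

In this toy case both neighbours vote for token 1, so the mixture shifts probability mass toward that token relative to the language model alone. If token 1 is the correct continuation, the perplexity computed from `mixed` is lower than the one computed from `p_lm`.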