Long-context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that enhances LLM performance on long-context tasks by adapting model parameters to the context at test time. LIFT enables efficient processing of lengthy inputs without the computational burden of offline long-context adaptation, and can improve the long-context capabilities of arbitrary short-context models. The framework is further enhanced by integrating in-context learning and pre-LIFT supervised fine-tuning. The combination of in-context learning and LIFT enables short-context models like Llama 3 to handle arbitrarily long contexts and consistently improves their performance on popular long-context benchmarks such as LooGLE and LongBench. We also provide a comprehensive analysis of the strengths and limitations of LIFT in long-context understanding, offering valuable directions for future research.
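To make the core idea concrete, here is a minimal sketch of test-time adaptation on a long input: the model's weights are fine-tuned on overlapping chunks of the given context before answering, and a short in-context excerpt can still be supplied at inference (the ICL + LIFT combination). The chunk length, overlap, learning rate, step count, and the plain next-token language-modeling objective below are illustrative assumptions, not the paper's exact recipe.

```python
# Sketch of LIFT-style test-time adaptation, assuming a chunked LM objective.
# Hyperparameters and the training objective are illustrative, not the paper's.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B"  # any short-context causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)


def lift_adapt(long_text: str, chunk_len: int = 2048, overlap: int = 256,
               lr: float = 1e-5, epochs: int = 1) -> None:
    """Run a few language-modeling gradient steps over overlapping chunks
    of the long input, adapting the weights to this specific context."""
    model.train()
    ids = tokenizer(long_text, return_tensors="pt").input_ids[0]
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    stride = chunk_len - overlap
    for _ in range(epochs):
        for start in range(0, max(len(ids) - overlap, 1), stride):
            chunk = ids[start:start + chunk_len].unsqueeze(0)
            out = model(input_ids=chunk, labels=chunk)  # next-token LM loss
            out.loss.backward()
            optimizer.step()
            optimizer.zero_grad()


def answer(question: str, context_tail: str, max_new_tokens: int = 128) -> str:
    """After adaptation, prompt the adapted model with the question plus a
    short excerpt of the context that still fits in the native window."""
    model.eval()
    prompt = f"{context_tail}\n\nQuestion: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:],
                            skip_special_tokens=True)
```

In this sketch the adaptation stores the long input in the model's parameters rather than in the attention window, which is why the native context limit no longer bounds the input length; the per-input fine-tuning cost replaces the cost of offline long-context training.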