Despite the general capabilities of pre-trained large language models (LLMs), they still require further adaptation to serve practical applications well. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation directions, each of which facilitates a variety of applications. Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions. We envision our work as a useful roadmap for future research on LLMs.