Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become increasingly resource-intensive, or impossible when model weights are private. We introduce proxy-tuning, a lightweight decoding-time algorithm that operates on top of black-box LMs to achieve the result of directly tuning the model while accessing only its predictions over the output vocabulary. Instead of tuning the large model, our method tunes a smaller LM, then applies the difference between the predictions of the small tuned and untuned LMs to shift the original predictions of the base model in the direction of tuning, while retaining the benefits of larger-scale pretraining. In experiments, when we apply proxy-tuning to Llama2-70B using proxies of only 7B size, we close 88% of the gap between Llama2-70B and its truly-tuned chat version when evaluated across knowledge, reasoning, and safety benchmarks. Interestingly, on TruthfulQA, proxy-tuned models are actually more truthful than directly tuned models, possibly because decoding-time guidance better retains the model's factual knowledge. We then demonstrate the generality of proxy-tuning by applying it to domain adaptation on code and to task-specific finetuning on question-answering and math problems. Our work demonstrates the promise of using small tuned LMs to efficiently customize large, potentially proprietary LMs through decoding-time guidance.
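The decoding-time shift described in the abstract can be illustrated with a minimal sketch. It assumes that each model's "prediction" is a vector of next-token logits over a shared vocabulary and that the shift is a simple additive offset followed by a softmax; the function names and the toy numbers below are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn logits into probabilities (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_probs(
    base_logits: List[float],
    tuned_small_logits: List[float],
    untuned_small_logits: List[float],
) -> List[float]:
    """Shift the large base model's logits by (small tuned - small untuned).

    Assumption: all three models score the same vocabulary in the same order.
    """
    if not (len(base_logits) == len(tuned_small_logits) == len(untuned_small_logits)):
        raise ValueError("all three models must share one vocabulary")
    shifted = [
        b + (t - u)
        for b, t, u in zip(base_logits, tuned_small_logits, untuned_small_logits)
    ]
    return softmax(shifted)


# Toy 3-token vocabulary with made-up logits for a single decoding step.
base = [2.0, 1.0, 0.0]           # large untuned model prefers token 0
small_tuned = [0.0, 2.0, 0.0]    # small tuned model prefers token 1
small_untuned = [0.0, 0.0, 0.0]  # small untuned model is indifferent

probs = proxy_tuned_probs(base, small_tuned, small_untuned)
next_token = max(range(len(probs)), key=probs.__getitem__)  # greedy choice
```

In this toy step the offset raises the logit of token 1 above that of token 0, so greedy decoding follows the direction of tuning even though the base model alone would have chosen token 0. When the small tuned and untuned models agree, the offset is zero and the base model's distribution is left unchanged.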