Large language models (LLMs) achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters. This poses a computational challenge when applying their evaluation capability at scale. To overcome this challenge, we propose \textbf{ECT}, an \textbf{e}valuation \textbf{c}apability \textbf{t}ransfer method that transfers the evaluation capability from LLMs to relatively lightweight language models. Using ECT, we learn various evaluation models from ChatGPT and employ them as reward models to improve sequence generation models via reinforcement learning and reranking. Experimental results on machine translation, text style transfer, and summarization tasks demonstrate the effectiveness of ECT. Notably, applying the learned evaluation models to sequence generation models yields better generated sequences, as evaluated by both commonly used metrics and ChatGPT.