When developing new large language models (LLMs), a key step is evaluating their final performance, often by computing the win rate against a reference model based on external feedback. Human feedback is the gold standard, particularly for capturing nuanced qualities such as coherence, readability, and alignment with human expectations. However, human evaluations are costly, even for large tech companies, and when conducted with active users they may degrade the user experience. A promising alternative is synthetic feedback, where evaluations are performed by other LLMs, including reward models. While this eliminates the need for costly human annotations, it introduces biases that may distort the evaluation. In this work, we propose a statistically principled framework that integrates human and synthetic feedback to reduce reliance on human annotations while keeping win-rate estimates unbiased. Our experiments demonstrate a reduction in human annotations of up to 12.2% with an off-the-shelf synthetic evaluator and up to 24.8% with a finetuned variant. Besides being generalizable, scalable, and free of hyperparameter tuning, our method offers predictable annotation savings that can be estimated from data-dependent characteristics.
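The idea of combining a small pool of human labels with abundant synthetic labels while keeping the win-rate estimate unbiased can be illustrated with a standard control-variate (prediction-powered) estimator. The sketch below is only an illustration of that general statistical recipe, not necessarily the paper's exact method; all quantities (the simulated human and synthetic judgments, the sizes `N` and `n`) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N prompt pairs are judged by a synthetic evaluator
# (binary win labels), but only a random subset of n also receives a
# human label. The data here is simulated purely for illustration.
N, n = 10_000, 1_000
human_all = rng.binomial(1, 0.6, size=N)  # latent human judgments (unobserved in practice)
# A biased synthetic evaluator: disagrees with humans on ~10% of pairs.
synthetic = np.where(rng.random(N) < 0.1, 1 - human_all, human_all)

labeled = rng.choice(N, size=n, replace=False)  # pairs sent to human annotators

# Control-variate / prediction-powered estimator:
#   win_rate_hat = mean(synthetic over all N) + mean(human - synthetic over labeled n)
# The cheap synthetic labels drive down variance, while the correction
# term, computed on the human-labeled subset, removes the synthetic
# evaluator's bias, so the estimate targets the human win rate.
naive_synthetic = synthetic.mean()
correction = (human_all[labeled] - synthetic[labeled]).mean()
win_rate_hat = naive_synthetic + correction
```

Because the correction term is an unbiased estimate of the synthetic evaluator's systematic error, the combined estimator stays unbiased for the human win rate regardless of how biased the synthetic judge is; a better (e.g. finetuned) synthetic judge only shrinks the variance, which is what allows fewer human annotations for the same precision.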