Uncertainty estimation is a significant challenge for current large language models (LLMs), which are generally poorly calibrated and over-confident, especially after reinforcement learning from human feedback (RLHF). Human decisions and confidence not only stem from intrinsic beliefs but can also be adjusted through daily observations. In contrast, existing calibration methods for LLMs focus on estimating or eliciting individual confidence without taking full advantage of "Collective Wisdom": the interaction among multiple LLMs that can collectively improve both accuracy and calibration. In this work, we propose Collaborative Calibration, a post-hoc, training-free calibration strategy that leverages the collaborative and expressive capabilities of multiple tool-augmented LLM agents in a simulated group deliberation process. We demonstrate the effectiveness of Collaborative Calibration on generative QA tasks across various domains, showing its potential for harnessing the rationalization of collectively calibrated confidence assessments and improving the reliability of model predictions.