Uncertainty estimation remains a significant challenge for current large language models (LLMs), which are generally poorly calibrated and over-confident, especially after reinforcement learning from human feedback (RLHF). Human decisions and confidence stem not only from intrinsic beliefs but are also adjusted through daily observations. In contrast, existing calibration methods for LLMs focus on estimating or eliciting individual confidence and do not take full advantage of "Collective Wisdom": the interaction among multiple LLMs that can collectively improve both accuracy and calibration. In this work, we propose Collaborative Calibration, a post-hoc, training-free calibration strategy that leverages the collaborative and expressive capabilities of multiple tool-augmented LLM agents in a simulated group deliberation process. We demonstrate the effectiveness of Collaborative Calibration on generative QA tasks across various domains, showing its potential to harness the rationalization of collectively calibrated confidence assessments and to improve the reliability of model predictions.
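To make "poorly calibrated and over-confident" concrete, the following is a minimal sketch of expected calibration error (ECE), a commonly used calibration metric. The abstract does not state which metric the work uses, so the choice of ECE, the equal-width binning, and the name `expected_calibration_error` are assumptions made for illustration only.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE with equal-width confidence bins (an assumed metric).

    confidences: stated confidence in [0, 1] for each answer.
    correct: whether each answer was actually right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sums = [0.0] * n_bins  # sum of confidences per bin
    hits = [0] * n_bins         # number of correct answers per bin
    counts = [0] * n_bins       # number of answers per bin

    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sums[idx] += conf
        hits[idx] += int(ok)
        counts[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue
        avg_conf = conf_sums[b] / counts[b]
        accuracy = hits[b] / counts[b]
        # Weight each bin's |accuracy - confidence| gap by its share of answers.
        ece += (counts[b] / n) * abs(accuracy - avg_conf)
    return ece


# An over-confident model: 90% stated confidence, 50% actual accuracy.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, False, True, False]
)

# A well-calibrated model: 50% stated confidence, 50% actual accuracy.
calibrated = expected_calibration_error(
    [0.5, 0.5, 0.5, 0.5], [True, False, True, False]
)
```

Under this sketch, a lower value means stated confidence tracks actual accuracy more closely; a post-hoc method such as the one proposed here would aim to reduce this gap without retraining the model.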