Large language models (LLMs) are highly capable at many tasks, but they can sometimes generate unreliable or inaccurate outputs. To address this issue, this paper studies uncertainty estimation and calibration for LLMs. We begin by formulating the uncertainty estimation problem for LLMs and then propose a supervised approach that uses labeled datasets to estimate the uncertainty of an LLM's responses. Based on this formulation, we illustrate how uncertainty estimation for LLMs differs from that for standard ML models and explain why the hidden activations of LLMs contain uncertainty information. Our approach demonstrates the benefit of using hidden activations for uncertainty estimation across various tasks and transfers robustly to out-of-distribution settings. Moreover, we distinguish the uncertainty estimation task from the uncertainty calibration task and show that better uncertainty estimation leads to better calibration performance. In practice, our method is easy to implement and adapts to different levels of model transparency (black box, grey box, and white box), performing strongly at each level given the access it allows to the LLM's internal mechanisms.
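As an illustration of the supervised idea summarized above, the following minimal sketch trains a small classifier to map per-response feature vectors to a confidence score, using labels that record whether each response was correct. It is not the paper's implementation: the feature vectors are synthetic stand-ins for hidden activations, and the logistic-regression estimator, the function names (fit_logistic, confidence, make_synthetic, auroc), and all hyperparameters are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(features, labels, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    dim = len(features[0])
    n = len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def confidence(w, b, x):
    # Estimated probability that the response behind features x is correct.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def make_synthetic(n, dim, rng):
    # Hypothetical stand-in for (hidden activation, correctness label) pairs:
    # Gaussian noise whose first coordinate is shifted by the label.
    feats, labs = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += 1.5 if y == 1 else -1.5
        feats.append(x)
        labs.append(y)
    return feats, labs


def auroc(scores, labels):
    # Probability that a correct response outranks an incorrect one (ties count half).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train_x, train_y = make_synthetic(400, 8, rng)
test_x, test_y = make_synthetic(200, 8, rng)

w, b = fit_logistic(train_x, train_y)
test_scores = [confidence(w, b, x) for x in test_x]
test_auroc = auroc(test_scores, test_y)
print(f"held-out AUROC: {test_auroc:.3f}")
```

In a white-box setting the synthetic vectors would be replaced by activations extracted from the LLM; in grey-box or black-box settings, by whatever features the available access permits. AUROC is used here only as one convenient measure of how well the scores separate correct from incorrect responses; it does not assess calibration.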