We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text. In the absence of ground-truth probabilities, we explore a setting where, in order to (approximately) disentangle a given LLM's uncertainty, a significantly larger model stands in as a proxy for the ground truth. We show that small linear probes trained on the embeddings of frozen, pretrained models accurately predict, at the token level, when larger models will be more confident, and that probes trained on one text domain generalize to others. Going further, we propose a fully unsupervised method that achieves non-trivial accuracy on the same task. We interpret these results, taken together, as evidence that LLMs naturally contain internal representations of different types of uncertainty, which could potentially be leveraged to devise more informative indicators of model confidence in diverse practical settings.
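The supervised probing setup summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the entropy thresholds `small_min` and `large_max`, the helper names `entropy`, `make_labels`, `train_linear_probe`, and `probe_predict`, and the use of plain gradient-descent logistic regression as the linear probe. In practice `x` would hold frozen embeddings from the smaller model, and the labels would come from comparing the two models' next-token distributions.

```python
import numpy as np


def entropy(probs):
    """Shannon entropy (in nats) of each row of a probability matrix."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=-1)


def make_labels(small_probs, large_probs, small_min=2.0, large_max=0.2):
    """Keep tokens where the small model is uncertain; label 1.0 if the
    large model is confident there. Both thresholds are hypothetical."""
    h_small = entropy(small_probs)
    h_large = entropy(large_probs)
    mask = h_small >= small_min
    labels = (h_large[mask] <= large_max).astype(np.float64)
    return mask, labels


def train_linear_probe(x, y, lr=0.1, steps=500, l2=1e-3):
    """Fit a logistic-regression probe (weights w, bias b) on embeddings x
    with full-batch gradient descent."""
    n, d = x.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(label = 1)
        grad = p - y                             # d(loss)/d(logit)
        w -= lr * (x.T @ grad / n + l2 * w)
        b -= lr * grad.mean()
    return w, b


def probe_predict(x, w, b):
    """Hard 0/1 predictions from the linear probe."""
    return (x @ w + b > 0).astype(np.float64)
```

A probe trained this way only reads the smaller model's embeddings at inference time; the larger model is needed solely to produce the training labels.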