We introduce a method to measure uncertainty in large language models. For tasks like question answering, it is essential to know when we can trust the natural language outputs of foundation models. We show that measuring uncertainty in natural language is challenging because of `semantic equivalence' -- different sentences can mean the same thing. To overcome these challenges, we introduce semantic entropy -- an entropy that incorporates the linguistic invariances created by shared meanings. Our method is unsupervised, uses only a single model, and requires no modifications to `off-the-shelf' language models. In comprehensive ablation studies, we show that semantic entropy is more predictive of model accuracy on question answering datasets than comparable baselines.
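To make the idea concrete, here is a minimal sketch of an entropy computed over meanings rather than over individual sentences. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how semantic equivalence is decided, so `meaning_of` is a hypothetical stand-in (here a toy string rule), and the sampled answers and their probabilities are invented for the example.

```python
import math
from collections import defaultdict


def entropy(masses):
    """Shannon entropy (in nats) of non-negative masses, normalized to sum to 1."""
    total = sum(masses)
    return -sum((m / total) * math.log(m / total) for m in masses if m > 0)


def semantic_entropy(answers, probs, meaning_of):
    """Entropy over meaning clusters instead of over surface strings.

    `meaning_of` is a hypothetical function mapping an answer to a label for
    its meaning; answers with the same label are treated as equivalent.
    """
    cluster_mass = defaultdict(float)
    for answer, p in zip(answers, probs):
        cluster_mass[meaning_of(answer)] += p  # pool probability of equivalent answers
    return entropy(list(cluster_mass.values()))


# Invented example data: four sampled answers and assumed model probabilities.
answers = ["Paris", "It's Paris", "The capital is Paris", "Lyon"]
probs = [0.4, 0.3, 0.2, 0.1]


def toy_meaning(answer):
    # Toy equivalence rule (assumption): any answer mentioning Paris means "paris".
    text = answer.lower()
    return "paris" if "paris" in text else text


naive = entropy(probs)  # treats every distinct sentence as a different outcome
semantic = semantic_entropy(answers, probs, toy_meaning)
print(f"naive entropy:    {naive:.3f}")
print(f"semantic entropy: {semantic:.3f}")
```

In this toy case three of the four sentences share a meaning, so pooling their probability yields a much lower entropy than the sentence-level calculation, which counts the paraphrases as disagreement.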