Large Language Models (LLMs), including ChatGPT and LLaMA, are susceptible to generating hallucinated answers in a confident tone. While efforts to elicit and calibrate confidence scores have proven useful, recent findings show that controlling uncertainty must go beyond calibration: predicted scores may deviate significantly from the actual posterior probabilities due to the impact of grouping loss. In this work, we construct a new evaluation dataset derived from a knowledge base to assess the confidence scores that Mistral and LLaMA assign to their answers. Experiments show that both models tend to be overconfident. Further, we show that they are more overconfident on some answers than on others, \emph{e.g.}, depending on the nationality of the person in the query. In uncertainty-quantification theory, this is known as grouping loss. To address this, we propose a solution to reconfidence LLMs, canceling not only calibration error but also grouping loss. After the reconfidencing process, the LLMs' confidence scores align better with the accuracy of their responses.
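To make the distinction between calibration and grouping loss concrete, the following is a minimal sketch on toy data. It is not the evaluation dataset or the reconfidencing method of this work: the groups `A` and `B`, the constant confidence of 0.8, and the helper names `confidence_gap` and `gap_by_group` are all hypothetical, chosen only for illustration. It shows a score that looks calibrated on average while being underconfident on one subgroup and overconfident on the other.

```python
from collections import defaultdict


def confidence_gap(confidences, correct):
    """Mean confidence minus accuracy; a positive value means overconfidence."""
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy


def gap_by_group(confidences, correct, groups):
    """Compute the confidence gap separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for conf, is_correct, group in zip(confidences, correct, groups):
        buckets[group][0].append(conf)
        buckets[group][1].append(is_correct)
    return {g: confidence_gap(cs, ys) for g, (cs, ys) in buckets.items()}


# Hypothetical toy data: 20 answers, all given the same confidence of 0.8.
# Group A: 9 of 10 answers correct. Group B: 7 of 10 answers correct.
groups = ["A"] * 10 + ["B"] * 10
correct = [1] * 9 + [0] + [1] * 7 + [0] * 3
confidences = [0.8] * 20

overall = confidence_gap(confidences, correct)
per_group = gap_by_group(confidences, correct, groups)

print(f"overall gap: {overall:+.3f}")
for group, gap in sorted(per_group.items()):
    print(f"group {group} gap: {gap:+.3f}")
```

In this toy setting the overall gap is approximately zero, so a calibration check alone would report no problem, yet the per-group gaps are roughly -0.1 and +0.1. That residual, hidden inside a single confidence level, is the kind of discrepancy that grouping loss refers to.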