Large Language Models (LLMs) are known to hallucinate, generating plausible but inaccurate text. This phenomenon poses significant risks in critical applications such as medicine or law, necessitating robust hallucination mitigation strategies. While recent work has proposed fine-tuning methods that teach LLMs to abstain from answering questions beyond their knowledge or capabilities, these methods either rely on the existence of ground-truth labels or are limited to short-form responses. To address these limitations, we propose fine-tuning using semantic entropy, an uncertainty measure derived from introspection into the model that requires no external labels. We demonstrate that our approach matches or outperforms models fine-tuned with prior methods and achieves strong performance for both short- and long-form generations on a range of datasets.
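As a rough illustration of the uncertainty measure involved, the sketch below shows how semantic entropy can be estimated from sampled generations: answers are clustered by meaning and the entropy of the empirical distribution over clusters is computed. This is a minimal sketch under assumptions, not the paper's implementation; the `are_equivalent` predicate is a hypothetical stand-in for the bidirectional-entailment check used in the semantic entropy literature.

```python
import math

def semantic_entropy(answers, are_equivalent):
    """Estimate semantic entropy over a list of sampled answers.

    Groups answers into semantic clusters via the supplied equivalence
    predicate, then returns the entropy of the empirical cluster
    distribution (each sample weighted equally).
    """
    clusters = []  # each cluster holds semantically equivalent answers
    for answer in answers:
        for cluster in clusters:
            if are_equivalent(answer, cluster[0]):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    n = len(answers)
    probs = [len(cluster) / n for cluster in clusters]
    return -sum(p * math.log(p) for p in probs)

# Toy equivalence check for demonstration only: exact match after
# normalisation. A realistic check would use an NLI model to test
# bidirectional entailment between the two answers.
toy_equiv = lambda a, b: a.strip().lower() == b.strip().lower()

samples = ["Paris", "paris", "Lyon", "Paris"]
print(semantic_entropy(samples, toy_equiv))  # low entropy: model is confident
```

In a fine-tuning setup along the lines the abstract describes, prompts whose sampled generations yield high semantic entropy signal likely hallucination and can serve as label-free targets for abstention.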