While Large Language Models (LLMs) demonstrate impressive capabilities, they still suffer from generating factually incorrect content (i.e., hallucinations). A promising approach to mitigating this issue is to enable models to express uncertainty when unsure. Previous research on uncertainty modeling has primarily focused on short-form QA, but real-world applications often require much longer responses. In this work, we introduce the task of Long-form Generation with Uncertainty (LoGU). We identify two key challenges: Uncertainty Suppression, where models hesitate to express uncertainty, and Uncertainty Misalignment, where models convey uncertainty inaccurately. To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline. Our framework adopts a divide-and-conquer strategy, refining uncertainty at the level of atomic claims. The collected data are then used for training via supervised fine-tuning (SFT) and direct preference optimization (DPO) to enhance uncertainty expression. Extensive experiments on three long-form instruction-following datasets show that our method significantly improves accuracy, reduces hallucinations, and maintains the comprehensiveness of responses.
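To make the divide-and-conquer refinement concrete, the following is a minimal Python sketch of claim-level uncertainty refinement. All helper names (decompose_into_claims, verify_claim, hedge) are hypothetical stand-ins for illustration, not the authors' released implementation.

```python
from typing import List


def decompose_into_claims(response: str) -> List[str]:
    """Hypothetical decomposition of a long-form response into atomic claims.
    In practice this is typically done by prompting an LLM; a naive sentence
    split stands in here."""
    return [s.strip() for s in response.split(".") if s.strip()]


def verify_claim(claim: str) -> bool:
    """Hypothetical fact check (e.g., retrieval or an LLM judge).
    Placeholder: treat every claim as unsupported."""
    return False


def hedge(claim: str) -> str:
    """Rewrite an unsupported claim as an explicit uncertainty expression."""
    return f"I am not sure, but it may be that {claim[0].lower() + claim[1:]}"


def refine_response(response: str) -> str:
    """Divide-and-conquer refinement: keep claims that pass verification and
    hedge the rest. Refined responses can serve as SFT targets, and
    (original, refined) pairs as preference pairs for DPO."""
    claims = decompose_into_claims(response)
    refined = [c if verify_claim(c) else hedge(c) for c in claims]
    return ". ".join(refined) + "."
```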
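For reference, the second training stage can be read as instantiating the standard DPO objective (Rafailov et al., 2023), shown below. Pairing the uncertainty-refined response as $y_w$ and the original response as $y_l$ is our reading of the abstract, not a detail it spells out:

\[
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right],
\]

where $\pi_\theta$ is the policy being trained, $\pi_{\mathrm{ref}}$ the frozen reference model (typically the SFT checkpoint), $\sigma$ the sigmoid function, and $\beta$ a temperature controlling deviation from the reference.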