Large language models (LLMs) have revolutionized numerous domains with their impressive performance, but they still face challenges. A predominant issue is their propensity to generate non-existent facts, a problem termed hallucination. Our research is motivated by the observation that previous instruction tuning methods force the model to complete a sentence regardless of whether it has the relevant knowledge. When a question falls outside its parametric knowledge, the model tries to make something up and fails to indicate that it lacks the knowledge. In this paper, we present a new approach called Refusal-Aware Instruction Tuning (R-Tuning). The approach first identifies the knowledge gap between the model's parametric knowledge and the instruction tuning data. It then constructs refusal-aware data based on the knowledge intersection and uses it to tune LLMs to refrain from responding to questions beyond their parametric knowledge. Experimental results demonstrate that this new instruction tuning approach effectively improves a model's ability to answer known questions and to refrain from answering unknown questions. Furthermore, when tested on out-of-domain datasets, the refusal ability was found to be a meta-skill that generalizes to other tasks. Further analysis surprisingly finds that learning uncertainty during training yields better uncertainty estimation than uncertainty-based testing. Our code will be released at https://github.com/shizhediao/R-Tuning.
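The two steps described above (identifying which training questions the model already knows, then constructing refusal-aware data) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the released implementation: the `predict` callable, the exact-match comparison, the `REFUSAL` wording, and the dictionary field names are all hypothetical choices made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical refusal text; the abstract does not specify the wording used.
REFUSAL = "I don't know."


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def build_refusal_aware_data(
    examples: List[Dict[str, str]],
    predict: Callable[[str], str],
    refusal: str = REFUSAL,
) -> List[Dict[str, object]]:
    """Split instruction data by whether the model already answers correctly.

    Assumption: a question counts as "known" when the model's prediction
    matches the reference answer after normalization. Known questions keep
    their reference answer as the tuning target; the rest get a refusal.
    """
    tuned = []
    for ex in examples:
        known = normalize(predict(ex["question"])) == normalize(ex["answer"])
        target = ex["answer"] if known else refusal
        tuned.append(
            {"question": ex["question"], "target": target, "known": known}
        )
    return tuned
```

In practice `predict` would wrap a call to the pre-trained model before tuning; here it is left as a plain callable so the data construction step can be read and tested in isolation.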