Large language models (LLMs) have revolutionized numerous domains with their impressive performance, but they still face challenges. A predominant issue is their propensity to generate non-existent facts, a problem termed hallucination. Our research is motivated by the observation that previous instruction tuning methods force the model to complete a sentence regardless of whether it possesses the relevant knowledge. When a question falls outside its parametric knowledge, the model tries to make something up and fails to indicate that it lacks the knowledge. In this paper, we present a new approach called Refusal-Aware Instruction Tuning (R-Tuning). The approach first identifies the disparity between the knowledge encompassed by the pre-trained parameters and that of the instruction tuning data. It then constructs refusal-aware data based on the knowledge intersection, to tune LLMs to refrain from responding to questions beyond their parametric knowledge. Experimental results demonstrate that R-Tuning effectively improves a model's ability to answer known questions and to refrain from answering unknown questions. Furthermore, when tested on out-of-domain datasets, the refusal ability is found to be a meta-skill that generalizes to other tasks. Further analysis surprisingly finds that learning uncertainty results in better calibration and a better ability to estimate uncertainty than uncertainty-based testing. Our code is available at https://github.com/shizhediao/R-Tuning.
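To make the two steps above concrete, the following is a minimal sketch of how refusal-aware data might be constructed; it is an illustration under stated assumptions, not the released implementation. The `predict` callable, the `Example` type, the exact-match comparison in `normalize`, and the "I am sure." / "I am unsure." suffixes are all hypothetical choices made here for illustration, since the abstract does not specify the matching rule or the refusal template.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Example:
    """One instruction tuning example (hypothetical container)."""
    question: str
    answer: str


def normalize(text: str) -> str:
    # Assumption: case- and whitespace-insensitive exact match decides
    # whether a prediction agrees with the reference answer.
    return " ".join(text.lower().split())


def build_refusal_aware_data(
    examples: Iterable[Example],
    predict: Callable[[str], str],
    sure_tag: str = "I am sure.",
    unsure_tag: str = "I am unsure.",
) -> List[Dict[str, object]]:
    """Split examples by whether the pre-trained model already answers them
    correctly, then attach a certainty tag to each training target."""
    data = []
    for ex in examples:
        # Step 1: probe parametric knowledge with the untuned model.
        known = normalize(predict(ex.question)) == normalize(ex.answer)
        # Step 2: build the refusal-aware target (assumed template).
        tag = sure_tag if known else unsure_tag
        data.append(
            {"prompt": ex.question, "target": f"{ex.answer} {tag}", "known": known}
        )
    return data


# Toy stand-in for a pre-trained model: it "knows" exactly one fact.
_toy_knowledge = {"What is the capital of France?": "paris"}


def toy_model(question: str) -> str:
    return _toy_knowledge.get(question, "a made-up guess")


examples = [
    Example("What is the capital of France?", "Paris"),
    Example("A question outside the toy model's knowledge?", "placeholder answer"),
]
data = build_refusal_aware_data(examples, toy_model)
```

In this sketch, the first example is tagged as known and the second as unknown; a real pipeline would replace `toy_model` with generation from the pre-trained LLM and would likely need a more tolerant answer-matching rule.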