Trustworthy language models should abstain from answering questions when they do not know the answer. However, the answer to a question can be unknown for a variety of reasons. Prior research has focused on the case in which the question is clear and the answer is unambiguous but possibly unknown, but the answer can also be unclear because of uncertainty about the questioner's intent or context. We investigate question answering from this perspective, focusing on answering a subset of questions with a high degree of accuracy, drawn from a set in which many questions are inherently ambiguous. In this setting, we find that the most reliable approach to deciding when to abstain involves quantifying repetition within sampled model outputs, rather than relying on the model's likelihood or self-verification as in prior work. We find this to be the case across different types of uncertainty and model scales, and with or without instruction tuning. Our results suggest that sampling-based confidence scores help calibrate answers to relatively unambiguous questions, with more dramatic improvements on ambiguous questions.
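As a rough illustration of what quantifying repetition within sampled outputs could look like, the sketch below scores a question by the fraction of sampled answers that agree with the most frequent one, and abstains when that fraction falls below a threshold. This is a minimal sketch under stated assumptions, not the paper's exact scoring method: the function names, the normalization rule, and the threshold value are all hypothetical, and the sampled answers are assumed to have been drawn from a model beforehand.

```python
from collections import Counter
import string


def normalize(answer: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (illustrative choice)."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(answer.lower().translate(table).split())


def repetition_confidence(samples):
    """Return (most frequent normalized answer, fraction of samples matching it)."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(s) for s in samples)
    answer, freq = counts.most_common(1)[0]
    return answer, freq / len(samples)


def answer_or_abstain(samples, threshold=0.5):
    """Return the majority answer, or None (abstain) if agreement is below threshold.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    answer, confidence = repetition_confidence(samples)
    return answer if confidence >= threshold else None
```

For example, with the samples `["Paris", "paris.", "Paris", "Lyon"]` the agreement is 3 out of 4, so the sketch answers; with samples that disagree widely, as might happen for an ambiguous question, agreement is low and it abstains. Exact string matching after normalization is the simplest possible notion of repetition and will undercount agreement between paraphrased answers.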