We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs capable of estimating uncertainty with every prediction. Our approach is model- and data-agnostic, is computationally efficient, and does not rely on external models or systems. We evaluate converted models in the selective question answering setting, where the goal is to answer as many questions as possible while maintaining a given accuracy, abstaining from prediction when necessary. As part of our results, we test BERT and Llama 2 model variants on the SQuAD extractive QA task and the TruthfulQA generative QA task. We show that using the uncertainty estimates provided by our approach to selectively answer questions leads to significantly higher accuracy than directly using model probabilities.
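To make the selective question answering setting concrete, the following is a minimal sketch of how coverage at a target accuracy might be computed from per-question confidence scores. It is an illustration under stated assumptions, not the evaluation code used in this work: the function name `max_coverage_at_accuracy`, the input lists, and the rule of answering the most confident questions first are all hypothetical. A confidence score here could come from either an uncertainty estimate (negated, so that higher means more certain) or a raw model probability.

```python
from typing import List, Tuple


def max_coverage_at_accuracy(
    confidences: List[float],
    correct: List[bool],
    target_accuracy: float,
) -> Tuple[int, float]:
    """Return (questions answered, coverage) for the largest confidence-ranked
    prefix whose accuracy is at least target_accuracy.

    Hypothetical helper for illustration only. Ties in confidence are broken
    by input order, which a real evaluation may want to handle differently.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    # Rank questions from most to least confident.
    order = sorted(
        range(len(confidences)), key=lambda i: confidences[i], reverse=True
    )

    best_k = 0
    num_correct = 0
    for k, idx in enumerate(order, start=1):
        num_correct += int(correct[idx])
        # Keep the largest prefix that still meets the accuracy target.
        if num_correct / k >= target_accuracy:
            best_k = k

    coverage = best_k / len(confidences) if confidences else 0.0
    return best_k, coverage


# Toy data (made up for illustration, not results from this work).
toy_confidences = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_correct = [True, True, False, True, False]
answered, coverage = max_coverage_at_accuracy(toy_confidences, toy_correct, 0.75)
```

Under this sketch, a better-calibrated confidence signal ranks correct answers ahead of incorrect ones, so more questions can be answered before accuracy drops below the target.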