Detecting and reducing hallucinations in large language models is an open research problem. In this project, we leverage recent advances in uncertainty estimation to reduce hallucinations in frozen large language models. Epistemic neural networks (ENNs) have recently been proposed to improve the joint output distributions of large pre-trained models: an ENN is a small network attached to a large, frozen model that improves the model's joint distributions and uncertainty estimates. In this work, we train an ENN on top of the Llama-2 7B model and combine it with a contrastive decoding feature enhancement technique. We are the first to train an ENN for the next-token prediction task, and we explore the efficacy of this method in reducing hallucinations on the TruthfulQA dataset. In essence, we provide a method that leverages a pre-trained model's latent embeddings to reduce hallucinations.