The development of large language models (LLMs) has greatly enhanced the intelligence and fluency of question answering, and the emergence of retrieval augmentation has enabled models to make better use of external information. However, noise and errors in retrieved information pose challenges to the robustness of LLMs. In this work, to evaluate model performance under multiple types of interference, we first construct a dataset, based on machine reading comprehension datasets, that simulates various scenarios, including missing critical information, noise, and conflicts. To address the decline in model accuracy caused by noisy external information, we propose a data augmentation-based fine-tuning method to enhance the robustness of LLMs against noise. In addition, a contrastive learning approach is employed to preserve the model's ability to discriminate among external information. We conduct experiments on both existing LLMs and our approach, with results evaluated by GPT-4, which indicate that our proposed methods improve model robustness while also strengthening the model's discrimination capability.