Large language models (LLMs) have achieved state-of-the-art performance on a wide range of natural language understanding tasks. However, these models may rely on dataset biases and artifacts as shortcuts for prediction, which significantly hurts their generalizability and adversarial robustness. In this paper, we review recent developments that address the shortcut learning and robustness challenges of LLMs. We first introduce the concept of shortcut learning in language models. We then present methods for identifying shortcut learning behavior, characterize the reasons it arises, and describe mitigation solutions. Finally, we discuss key research challenges and potential research directions for advancing the field of LLMs.