Pre-trained language models (PLMs) have achieved impressive results on a wide range of natural language processing tasks. However, recent research has revealed that these models often rely on superficial features and shortcuts instead of developing a genuine understanding of language, especially on natural language understanding (NLU) tasks. As a result, they struggle to generalize to out-of-domain data. In this work, we propose Divergence Based Regularization (DBR) to mitigate this shortcut learning behavior. Our method measures the divergence between the output distributions for original examples and for examples in which shortcut tokens have been masked, and penalizes this divergence during training, so that the model's predictions are not overly influenced by shortcut features or biases. We evaluate our method on three NLU tasks and find that it improves out-of-domain performance with little loss of in-domain accuracy. Our results demonstrate that reducing reliance on shortcuts and superficial features can enhance the generalization ability of large pre-trained language models.
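To make the idea concrete, below is a minimal PyTorch sketch of such a divergence-based regularizer. This is an assumption-based illustration, not the paper's implementation: `TinyClassifier`, `mask_token_id`, the weight `lam`, and the random `shortcuts` detector are placeholders, and KL divergence is one plausible choice of divergence measure, which the abstract does not pin down.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Toy classifier standing in for a PLM; names and shapes are illustrative.
class TinyClassifier(nn.Module):
    def __init__(self, vocab_size=1000, dim=64, num_labels=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, num_labels)

    def forward(self, input_ids):
        # Mean-pool token embeddings, then classify: (batch, num_labels).
        return self.head(self.emb(input_ids).mean(dim=1))

def dbr_loss(model, input_ids, labels, shortcut_mask, mask_token_id=0, lam=1.0):
    """Cross-entropy on the original input plus a divergence penalty between
    predictions on the original input and on a copy with shortcut tokens masked.

    `shortcut_mask` (bool, same shape as `input_ids`) marks shortcut tokens;
    how such tokens are identified is not specified by the abstract.
    """
    logits_orig = model(input_ids)
    masked_ids = input_ids.masked_fill(shortcut_mask, mask_token_id)
    logits_masked = model(masked_ids)

    ce = F.cross_entropy(logits_orig, labels)
    # KL(P_orig || P_masked): large when masking shortcut tokens changes the
    # prediction, i.e., when the model leans on shortcuts. KL is an assumption.
    kl = F.kl_div(F.log_softmax(logits_masked, dim=-1),
                  F.softmax(logits_orig, dim=-1),
                  reduction="batchmean")
    return ce + lam * kl

# Usage on random data:
model = TinyClassifier()
ids = torch.randint(1, 1000, (8, 16))
labels = torch.randint(0, 3, (8,))
shortcuts = torch.rand(8, 16) < 0.1  # placeholder for a real shortcut detector
loss = dbr_loss(model, ids, labels, shortcuts)
loss.backward()
```

Minimizing the penalty forces predictions to stay stable when shortcut tokens are removed, which is one way to realize the regularization the abstract describes.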