Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability, because CAD induces language models to exploit causal features and exclude spurious correlations. However, the empirical OOD generalization results obtained with CAD have not been as strong as expected. In this paper, we attribute this shortfall to a Myopia Phenomenon caused by CAD: language models focus only on the causal features that are edited during augmentation and exclude other, non-edited causal features. As a result, the potential of CAD is not fully exploited. Based on the structural properties of CAD, we design two additional constraints that help language models extract more complete causal features contained in CAD, thereby improving OOD generalization capability. We evaluate our method on two tasks, Sentiment Analysis and Natural Language Inference, and the experimental results demonstrate that our method can unlock CAD's potential and improve language models' OOD generalization capability.