In recent years, large language models have achieved state-of-the-art performance across a wide range of NLP tasks. However, investigations have shown that these models tend to rely on shortcut features, which leads to inaccurate predictions and makes the models generalize poorly to out-of-distribution (OOD) samples. In relation extraction (RE), for instance, we would expect a model to identify the same relation regardless of the entities involved. Consider the sentence "Leonardo da Vinci painted the Mona Lisa", which expresses the created(Leonardo_da_Vinci, Mona_Lisa) relation. If we substitute "Leonardo da Vinci" with "Barack Obama", the sentence still expresses the created relation, and a robust model should detect the same relation in both cases. In this work, we describe several semantically motivated strategies for generating adversarial examples by replacing entity mentions, and we investigate how state-of-the-art RE models perform under this pressure. Our analyses show that the performance of these models deteriorates significantly on the modified datasets (an average drop of 48.5% in F1), which indicates that they rely to a great extent on shortcuts, such as the surface forms of entities (or patterns therein), without making full use of the information present in the sentences.
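To make the entity-replacement idea concrete, the following is a minimal sketch of a single substitution on the Leonardo da Vinci example. It is not the authors' implementation: the `RelationExample` class, the `span_of` and `substitute_head` helpers, and the character-span representation are all assumptions introduced here for illustration, and the semantically motivated choice of the replacement entity is left out entirely.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class RelationExample:
    """Hypothetical container for one RE instance (not from the paper)."""
    text: str
    head: Tuple[int, int]  # character span [start, end) of the head entity
    tail: Tuple[int, int]  # character span [start, end) of the tail entity
    relation: str


def span_of(text: str, mention: str) -> Tuple[int, int]:
    """Return the character span of the first occurrence of `mention`."""
    start = text.index(mention)
    return (start, start + len(mention))


def substitute_head(example: RelationExample, new_mention: str) -> RelationExample:
    """Replace the head entity mention and keep the gold relation label.

    The tail span is shifted when it lies after the head, so that it
    still points at the same tail mention in the new sentence.
    """
    start, end = example.head
    new_text = example.text[:start] + new_mention + example.text[end:]
    shift = len(new_mention) - (end - start)

    tail_start, tail_end = example.tail
    if tail_start >= end:
        new_tail = (tail_start + shift, tail_end + shift)
    else:
        new_tail = example.tail

    return replace(
        example,
        text=new_text,
        head=(start, start + len(new_mention)),
        tail=new_tail,
    )


sentence = "Leonardo da Vinci painted the Mona Lisa"
original = RelationExample(
    text=sentence,
    head=span_of(sentence, "Leonardo da Vinci"),
    tail=span_of(sentence, "Mona Lisa"),
    relation="created",
)

# The label is unchanged: only the surface form of the head entity differs.
perturbed = substitute_head(original, "Barack Obama")
print(perturbed.text)
```

Under these assumptions, a model that relies on the sentence context rather than on the entity's surface form should predict `created` for both `original` and `perturbed`; comparing F1 on the original and perturbed sets is one way to quantify the gap.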