Estimating the Causal Effects of Natural Logic Features in Transformer-Based NLI Models

Rigorous evaluation of the causal effects of semantic features on language model predictions can be hard to achieve for natural language reasoning problems. However, this form of analysis is highly desirable from both an interpretability and a model evaluation perspective, so it is valuable to investigate specific patterns of reasoning that have enough structure and regularity to identify and quantify systematic reasoning failures in widely-used models. In this vein, we pick a portion of the NLI task for which an explicit causal diagram can be systematically constructed: the case where two related words/terms occur in a shared context across two sentences (the premise and the hypothesis). In this work, we apply causal effect estimation strategies to measure the effect of context interventions (whose effect on the entailment label is mediated by the semantic monotonicity characteristic) and of interventions on the inserted word-pair (whose effect on the entailment label is mediated by the relation between these words). Extending related work on causal analysis of NLP models in different settings, we perform an extensive interventional study on the NLI task to investigate the robustness of Transformers to irrelevant changes and their sensitivity to impactful ones. The results provide strong evidence that similar benchmark accuracy scores may be observed for models that exhibit very different behaviour. Moreover, our methodology confirms, from a causal perspective, previously suspected biases, including a bias in favour of upward-monotone contexts and a tendency to ignore the effects of negation markers.
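The interventional setup described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-valued context variable, the three word-pair relations, the names `gold_label`, `biased_model` and `intervention_effects`, and the simple change-rate statistics, which stand in for the paper's causal effect estimates. The toy "model" ignores the context entirely, mimicking a bias in favour of upward-monotone contexts.

```python
from itertools import product

# Hypothetical toy variables (assumptions, not the paper's dataset):
# the shared context is upward or downward monotone, and the inserted
# word pair (x, y) stands in a forward, reverse, or unrelated relation.
CONTEXTS = ["up", "down"]
RELATIONS = ["forward", "reverse", "none"]


def gold_label(monotonicity, relation):
    """Gold label implied by the toy causal diagram: entailment holds only
    when the context direction matches the word-pair relation direction."""
    if (monotonicity, relation) in {("up", "forward"), ("down", "reverse")}:
        return "entailment"
    return "non-entailment"


def biased_model(monotonicity, relation):
    """Stand-in for an NLI model that treats every context as upward
    monotone, i.e. it ignores the context variable."""
    return gold_label("up", relation)


def intervention_effects(model, variable):
    """Intervene on one variable while holding the other fixed.

    Returns the fraction of interventions where the prediction changes,
    split by whether the gold label changes (sensitivity) or stays the
    same (robustness violations).
    """
    values = CONTEXTS if variable == "context" else RELATIONS
    # gold_changed -> [number of prediction changes, number of pairs]
    counts = {True: [0, 0], False: [0, 0]}
    for m, r in product(CONTEXTS, RELATIONS):
        for new in values:
            before = (m, r)
            after = (new, r) if variable == "context" else (m, new)
            if after == before:
                continue
            gold_changed = gold_label(*before) != gold_label(*after)
            pred_changed = model(*before) != model(*after)
            counts[gold_changed][0] += int(pred_changed)
            counts[gold_changed][1] += 1
    return {
        "sensitivity": counts[True][0] / counts[True][1],
        "robustness_violations": counts[False][0] / counts[False][1],
    }


if __name__ == "__main__":
    for variable in ("context", "relation"):
        print(variable, intervention_effects(biased_model, variable))
```

In this sketch, a model that follows the causal diagram exactly has sensitivity 1.0 and no robustness violations for both kinds of intervention, whereas the context-blind stand-in never reacts to context interventions. The point is only to show how the two quantities separate models that might otherwise reach similar accuracy on a benchmark.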

