Contradictory conclusions have recently been drawn about how pretrained language models (PLMs) encode the semantic impact of negation (e.g. Kassner and Schütze (2020); Gubelmann and Handschuh (2022)). In this paper we focus instead on the way PLMs encode negation and its formal impact, through the phenomenon of Negative Polarity Item (NPI) licensing in English. More precisely, we use probes to identify which contextual representations best encode 1) the presence of negation in a sentence, and 2) the polarity of a neighboring masked polarity item. We find that contextual representations of tokens inside the negation scope allow for (i) a better prediction of the presence of not than those outside the scope, and (ii) a better prediction of the correct polarity of a masked polarity item licensed by not, although the magnitude of the difference varies from PLM to PLM. Importantly, in both cases the trend holds even when controlling for distance to not. This suggests that the embeddings of these models do reflect the notion of negation scope, and do encode the impact of negation on NPI licensing. Yet further control experiments reveal that the presence of other lexical items is also better captured from the contextual representation of a token within the same syntactic clause than from one outside it, suggesting that PLMs may simply capture the more general notion of the syntactic clause.
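The probing setup described above can be sketched minimally as follows: a linear classifier is trained on frozen contextual token representations to predict a binary property (here, whether the token lies inside a negation scope). This is an illustrative sketch only, using synthetic vectors in place of actual PLM embeddings; the dimensionality, the injected signal, and the training loop are all assumptions, not the paper's actual probe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for contextual token representations (hypothetical);
# in a real probe these would be frozen embeddings extracted from a PLM.
d = 32          # embedding dimension (assumed for illustration)
n = 400         # tokens per class
signal = rng.normal(size=d)
X_in = rng.normal(size=(n, d)) + 0.8 * signal   # label 1: inside negation scope
X_out = rng.normal(size=(n, d))                  # label 0: outside the scope
X = np.vstack([X_in, X_out])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Linear probe: logistic regression trained by plain gradient descent.
w = np.zeros(d)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on weights
    b -= lr * np.mean(p - y)                 # gradient step on bias

acc = np.mean(((X @ w + b) > 0) == (y == 1))
print(f"probe accuracy: {acc:.2f}")
```

If the property is linearly decodable from the representations, the probe's accuracy exceeds chance; comparing probe accuracy across token positions (inside vs. outside the scope, controlling for distance) is what supports the paper's conclusions.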