Natural Language Inference (NLI) has been an important task for evaluating language models for Natural Language Understanding, but the logical properties of the task are poorly understood and often mischaracterized. Understanding the notion of inference captured by NLI is key to interpreting model performance on the task. In this paper we formulate three possible readings of the NLI label set and perform a comprehensive analysis of the meta-inferential properties they entail. Focusing on the SNLI dataset, we exploit (1) NLI items with shared premises and (2) items generated by LLMs to evaluate models trained on SNLI for meta-inferential consistency and derive insights into which reading of the logical relations is encoded by the dataset.