Natural Language Inference (NLI) has been an important task for evaluating language models for Natural Language Understanding, but the logical properties of the task are poorly understood and often mischaracterized. Understanding the notion of inference captured by NLI is key to interpreting model performance on the task. In this paper, we formulate three possible readings of the NLI label set and perform a comprehensive analysis of the meta-inferential properties they entail. Focusing on the SNLI dataset, we exploit (1) NLI items with shared premises and (2) items generated by LLMs to evaluate models trained on SNLI for meta-inferential consistency and to derive insights into which reading of the logical relations is encoded by the dataset.