Natural language inference (NLI) asks whether a premise, expressed in natural language, entails a natural language hypothesis. NLI benchmarks contain human ratings of entailment, but the meaning relationships driving these ratings are not formalized. Can the underlying sentence-pair relationships be made more explicit in an interpretable yet robust fashion? We compare semantic structures for representing the premise and hypothesis, including sets of contextualized embeddings and semantic graphs (Abstract Meaning Representations), and use interpretable metrics to measure whether the hypothesis is a semantic substructure of the premise. Our evaluation on three English benchmarks finds value in both contextualized embeddings and semantic graphs; moreover, the two provide complementary signals and can be combined in a hybrid model.
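To make the notion of "hypothesis as a semantic substructure of the premise" concrete, the following is a minimal sketch and not the metrics evaluated in this work. It assumes two simplified, asymmetric coverage scores: a greedy best-match cosine score over sets of token vectors, and exact-match overlap over graph triples. The function names, the toy vectors, the simplified AMR-style triples, and the linear `weight` used to combine the two signals are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_coverage(premise_vecs, hypothesis_vecs):
    """Mean, over hypothesis vectors, of the best cosine match among premise vectors.

    Asymmetric by design: it asks how much of the hypothesis is found in the premise.
    """
    if not hypothesis_vecs:
        return 1.0  # assumption: an empty hypothesis is vacuously covered
    if not premise_vecs:
        return 0.0
    best = [max(cosine(h, p) for p in premise_vecs) for h in hypothesis_vecs]
    return sum(best) / len(best)


def triple_coverage(premise_triples, hypothesis_triples):
    """Fraction of hypothesis triples that also occur (exactly) in the premise graph."""
    hyp = set(hypothesis_triples)
    if not hyp:
        return 1.0  # same vacuous-coverage assumption as above
    return len(hyp & set(premise_triples)) / len(hyp)


def hybrid_score(emb_score, graph_score, weight=0.5):
    """Hypothetical linear combination of the two signals."""
    return weight * emb_score + (1.0 - weight) * graph_score


# Toy stand-ins for contextualized token embeddings (real ones would come from an encoder).
premise_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hypothesis_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Simplified AMR-style triples for "A man plays guitar outside" vs. "A man plays".
premise_triples = [
    ("play", ":ARG0", "man"),
    ("play", ":ARG1", "guitar"),
    ("play", ":location", "outside"),
]
hypothesis_triples = [("play", ":ARG0", "man")]

forward = hybrid_score(
    embedding_coverage(premise_vecs, hypothesis_vecs),
    triple_coverage(premise_triples, hypothesis_triples),
)
backward = hybrid_score(
    embedding_coverage(hypothesis_vecs, premise_vecs),
    triple_coverage(hypothesis_triples, premise_triples),
)
```

In this toy case the forward score (hypothesis inside premise) is higher than the backward score, which reflects the directional nature of entailment. A realistic setup would need soft alignment between graph nodes rather than exact triple matching, since parsed graphs for a premise and hypothesis rarely share identical labels.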