Explainable natural language inference aims to provide a mechanism for producing explanatory (abductive) inference chains that ground claims in their supporting premises. A recent corpus, EntailmentBank, advances this task by explaining the answer to a question with an entailment tree \cite{dalvi2021explaining}. Its authors employ the T5 model to generate the tree directly, which explains how the answer is inferred. However, this approach cannot explain or control the generation of intermediate steps, which is crucial to the multi-hop inference process. In this work, we propose a controlled natural language inference architecture for multi-premise explanatory inference. To improve control and enable explanatory analysis of the generation, we define lexical inference types based on Abstract Meaning Representation (AMR) graphs and modify the architecture of T5 to learn a latent sentence representation (the T5 bottleneck) conditioned on this type information. We also deliver a dataset of approximately 5000 annotated explanatory inference steps, with well-grounded lexical-symbolic operations. Experimental results indicate that the inference typing induced at the T5 bottleneck helps T5 generate a conclusion under explicit control.