Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task in which machines answer questions by choosing among provided options. Answers in multi-choice MRC cannot be directly extracted from the given passages, so machines must reason over accurately extracted evidence. However, the critical evidence may be as simple as a single word or phrase, yet it is hidden in a redundant, noisy passage that spans multiple linguistic levels: phrase, fragment, sentence, and the entire passage. To address this difficulty, we propose a novel general-purpose model enhancement that comprehensively integrates multi-grained evidence, named Multi-grained evidence inferencer (Mugen). Mugen extracts evidence at three granularities (coarse-, middle-, and fine-grained) and integrates it with the original passages, achieving significant and consistent performance improvements on four multi-choice MRC benchmarks.