Multi-choice Machine Reading Comprehension (MRC) is a challenging Natural Language Processing (NLP) task that requires a model to comprehend the semantics of a given text and the logical relationships between the entities in it. MRC has traditionally been treated as a single-stage process of answering questions based on the given text. This single-stage approach often leads the network to concentrate on producing the correct answer while potentially neglecting comprehension of the text itself. As a result, many prevalent models struggle on this task when dealing with longer texts. In this paper, we propose a two-stage knowledge distillation method that teaches the model to better comprehend the document by dividing the MRC task into two separate stages. Our experimental results show that a student model trained with our method achieves significant improvements, demonstrating the method's effectiveness.