Existing pre-training methods for extractive Question Answering (QA) generate cloze-like queries that differ from natural questions in syntactic structure, which can cause pre-trained models to overfit to simple keyword matching. To address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA. Specifically, MCROSS introduces a momentum contrastive learning framework to align the answer probabilities of cloze-like and natural query-passage sample pairs. As a result, the pre-trained models can better transfer the knowledge learned from cloze-like samples to answering natural questions. Experimental results on three benchmark QA datasets show that our method achieves noticeable improvements over all baselines in both supervised and zero-shot scenarios.
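The two ingredients named above, a momentum-updated encoder and an alignment of answer probabilities between cloze-like and natural queries, can be illustrated with a minimal sketch. The abstract does not specify the exact loss or update rule, so everything below is an assumption made for illustration: the function names (`momentum_update`, `alignment_loss`), the momentum coefficient `m`, and the choice of a symmetric KL divergence as the alignment objective are hypothetical and may differ from what MCROSS actually uses.

```python
import math


def softmax(logits):
    """Turn per-token answer logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def momentum_update(query_params, key_params, m=0.999):
    """MoCo-style exponential moving average (assumed, not from the abstract).

    The key (momentum) encoder slowly tracks the query encoder:
    key <- m * key + (1 - m) * query.
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def alignment_loss(cloze_logits, natural_logits):
    """Symmetric KL between the answer distributions of a cloze-like query
    and a natural question over the same passage (hypothetical objective)."""
    p = softmax(cloze_logits)
    q = softmax(natural_logits)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


if __name__ == "__main__":
    # Toy answer-start logits over a four-token passage.
    cloze = [2.0, 0.5, 0.1, -1.0]
    natural = [1.5, 0.7, 0.2, -0.5]
    print("alignment loss:", alignment_loss(cloze, natural))
    print("updated key params:", momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
```

In this sketch the loss is zero when both query types yield the same answer distribution and grows as they diverge, which is the behaviour an alignment objective of this kind would need; a real implementation would operate on encoder outputs for start and end positions rather than on hand-written logits.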