Machine Reading Comprehension (MRC) models tend to exploit spurious correlations (also known as dataset bias or annotation artifacts in the research community). Consequently, these models may perform the MRC task without fully comprehending the given context and question, which is undesirable because it can reduce robustness against distribution shift. This paper focuses on answer-position bias, where a significant percentage of training questions have answers located solely in the first sentence of the context. We propose a Single-Sentence Reader as a new approach for addressing answer-position bias in MRC. In our experiments with six different models, Single-Sentence Readers trained on a biased dataset achieve results that nearly match those of models trained on a normal dataset, demonstrating their effectiveness in addressing answer-position bias. We also discuss several challenges that Single-Sentence Readers encounter and propose a potential solution.