Pre-trained generative models such as BART and T5 have become a preferred method for text generation in many natural language processing tasks, including abstractive long-form question answering (QA) and summarization. However, their potential in extractive QA tasks, where discriminative models are commonly employed, remains largely unexplored. Discriminative models often struggle with label sparsity, particularly when only a small portion of the context contains the answer, and the problem is more pronounced for multi-span answers. In this work, we introduce a novel approach that uses pre-trained generative models to address extractive QA tasks by generating the indices of the context tokens or sentences that form part of the answer. Through comprehensive evaluations on multiple extractive QA datasets, including MultiSpanQA, BioASQ, MASHQA, and WikiQA, we show that the proposed approach outperforms existing state-of-the-art models.
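The index-generation idea can be illustrated with a minimal sketch. The abstract does not specify the exact input or output format, so everything below is an assumption made for illustration: the `[i] token` marker scheme, the space-separated index target, and the helper names `index_context`, `target_from_spans`, and `decode_answer` are all hypothetical. The sketch covers only the data formatting around a sequence-to-sequence model, not the model itself.

```python
import re


def index_context(tokens):
    """Prefix each context token with its position (assumed marker format)."""
    return " ".join(f"[{i}] {tok}" for i, tok in enumerate(tokens))


def target_from_spans(spans):
    """Turn gold answer spans (inclusive start, end) into an index string.

    This string would serve as the generation target during fine-tuning.
    """
    indices = [i for start, end in spans for i in range(start, end + 1)]
    return " ".join(str(i) for i in indices)


def decode_answer(generated, tokens):
    """Map a generated index string back to answer spans in the context."""
    # Keep only in-range indices, since a generative model may emit invalid ones.
    indices = sorted(
        {int(m) for m in re.findall(r"\d+", generated) if int(m) < len(tokens)}
    )
    spans, current = [], []
    for i in indices:
        # A gap between indices closes the current span and starts a new one.
        if current and i != current[-1] + 1:
            spans.append(" ".join(tokens[j] for j in current))
            current = []
        current.append(i)
    if current:
        spans.append(" ".join(tokens[j] for j in current))
    return spans


context = "Marie Curie won prizes in physics and chemistry .".split()
model_input = index_context(context)          # fed to the model with the question
target = target_from_spans([(5, 5), (7, 7)])  # "5 7" for a two-span answer
answers = decode_answer(target, context)      # ["physics", "chemistry"]
```

Under these assumptions the target sequence stays short even when the context is long, and a multi-span answer is simply a target with gaps between index runs.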