Question answering (QA) is the task of automatically answering questions posed by humans in natural language. QA comes in several settings, including abstractive, extractive, boolean, and multiple-choice QA. Extractive QA, a popular topic in natural language processing, has received extensive attention in the past few years. Generalized cross-lingual transfer (G-XLT), where the question and the answer context are in different languages, poses unique challenges beyond cross-lingual transfer (XLT), where the question and the answer context are in the same language. Driven by the development of related benchmarks, many works have improved QA performance across languages. However, only a few works are dedicated to the G-XLT task. In this work, we propose a generalized cross-lingual transfer framework to enhance the model's ability to understand different languages. Specifically, we first assemble triples from different languages to form multilingual knowledge. Since the lack of knowledge across languages greatly limits models' reasoning ability, we further design a knowledge injection strategy that leverages link prediction techniques to enrich the multilingual knowledge stored in the model. In this way, we can more fully exploit rich semantic knowledge. Experimental results on the real-world MLQA dataset demonstrate that the proposed method improves performance by a large margin, outperforming the baseline method by 13.18%/12.00% F1/EM on average.
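For readers unfamiliar with the F1/EM figures reported above, the following is a minimal sketch of how exact match (EM) and token-level F1 are commonly computed for extractive QA. It is not the authors' evaluation code: the function names `normalize`, `exact_match`, and `f1_score` are hypothetical, and the normalization shown is an English-only assumption. Multilingual benchmarks typically need language-specific normalization and tokenization (for example, for Chinese), which this sketch omits.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation and English articles, collapse spaces.

    Assumption: English-only normalization, for illustration.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

In a G-XLT setting the predicted span is extracted from the context, so it is compared against a gold answer in the context language, regardless of the question language. Scores are then averaged over all questions and, for an "on average" figure, over language pairs.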