The Open-Domain Question Answering (ODQA) task involves retrieving relevant fine-grained passages from a database and then generating answers from them. Current systems use Pretrained Language Models (PLMs) to model the relationship between questions and passages. However, the diversity of surface-form expressions can hinder a model's ability to capture accurate correlations, especially in complex contexts. We therefore use Abstract Meaning Representation (AMR) graphs to help the model understand complex semantic information, and we introduce a method called Graph-as-Token (GST) to incorporate AMRs into PLMs. Experiments on Natural Questions (NQ) and TriviaQA (TQ) show that GST significantly improves performance, with Exact Match gains of up to 2.44 on NQ and 3.17 on TQ. Our method also improves robustness and outperforms alternative Graph Neural Network (GNN) methods for integrating AMRs. To the best of our knowledge, we are the first to employ semantic graphs in ODQA.