Question Answering (QA) is a task that entails reasoning over natural language contexts, and many related works augment language models (LMs) with graph neural networks (GNNs) to encode Knowledge Graph (KG) information. However, most existing GNN-based modules for QA do not take advantage of the rich relational information in KGs and depend on limited information interaction between the LM and the KG. To address these issues, we propose Question Answering Transformer (QAT), which is designed to jointly reason over language and graphs with respect to entity relations in a unified manner. Specifically, QAT constructs Meta-Path tokens, which learn relation-centric embeddings based on diverse structural and semantic relations. Then, our Relation-Aware Self-Attention module comprehensively integrates different modalities via the Cross-Modal Relative Position Bias, which guides information exchange between relevant entities of different modalities. We validate the effectiveness of QAT on commonsense question answering datasets such as CommonsenseQA and OpenBookQA, and on a medical question answering dataset, MedQA-USMLE. On all the datasets, our method achieves state-of-the-art performance. Our code is available at http://github.com/mlvlab/QAT.
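To make the idea of bias-guided cross-modal attention concrete, the following is a minimal sketch and not the QAT implementation. It assumes a single attention head, random stand-in embeddings for the LM tokens and the Meta-Path tokens, a hand-written alignment matrix `aligned` between the two modalities, and a fixed scalar `bias_strength`; all of these names and values are hypothetical and chosen only for illustration. The bias is added to the attention logits so that aligned text and graph tokens attend to each other more strongly.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def biased_self_attention(tokens, bias, w_q, w_k, w_v):
    """Single-head self-attention with an additive bias on the logits."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    logits = q @ k.T / np.sqrt(q.shape[-1]) + bias
    weights = softmax(logits, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
n_text, n_graph, d = 4, 3, 8

# Stand-ins for LM token embeddings and relation-centric graph tokens.
text_tokens = rng.normal(size=(n_text, d))
graph_tokens = rng.normal(size=(n_graph, d))
tokens = np.concatenate([text_tokens, graph_tokens], axis=0)

# Hypothetical alignment: aligned[i, j] = 1 when text token i is
# considered relevant to graph token j.
aligned = np.zeros((n_text, n_graph))
aligned[0, 1] = 1.0
aligned[2, 0] = 1.0

# Assumed fixed scalar; a trained model would learn this quantity.
bias_strength = 2.0
n_total = n_text + n_graph
bias = np.zeros((n_total, n_total))
bias[:n_text, n_text:] = bias_strength * aligned      # text -> graph
bias[n_text:, :n_text] = bias_strength * aligned.T    # graph -> text

# Random projection matrices in place of learned weights.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
output, weights = biased_self_attention(tokens, bias, w_q, w_k, w_v)
```

Because the bias only shifts the logits of aligned cross-modal pairs, the rest of the attention computation is unchanged, which is what allows both modalities to be processed by one self-attention layer in this sketch.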