Textbook question answering (TQA) is a challenging task in artificial intelligence due to the complex nature of the context required to answer difficult questions. Although previous research has advanced the task, textual TQA still faces limitations, including weak reasoning and an inability to capture contextual information from lengthy passages. We propose a framework (PLRTQA) that incorporates retrieval-augmented generation (RAG) to handle the out-of-domain scenario where concepts are spread across different lessons, and utilizes transfer learning to handle long contexts and enhance reasoning abilities. Our architecture outperforms the baseline, achieving an accuracy improvement of 4.12% on the validation set and 9.84% on the test set for textual multiple-choice questions. While this paper focuses on solving challenges in textual TQA, it provides a foundation for future work in multimodal TQA, where visual components are integrated to address more complex educational scenarios. Code: https://github.com/hessaAlawwad/PLR-TQA