Flowchart Question Answering (FlowchartQA) is a multi-modal task that automatically answers questions conditioned on graphical flowcharts. Current studies convert flowcharts into interlanguages (e.g., Graphviz) for Question Answering (QA), which effectively bridges the modality gap between questions and flowcharts. More importantly, these interlanguages expose the link relations between nodes in the flowchart, which facilitates shallow relational reasoning when tracing answers. However, existing interlanguages still overlook intricate semantic and logical relationships, such as Conditional and Causal relations, and this hinders deep reasoning on complex questions. To address the issue, we propose a novel Semantic Relation-Aware (SRA) FlowchartQA approach. It leverages a Large Language Model (LLM) to detect the discourse semantic relations between nodes, and uses them to upgrade a link-based interlanguage into a semantic relation-based interlanguage. In addition, we conduct an interlanguage-controllable reasoning process, in which the question intention is analyzed to determine both the depth of reasoning (shallow or deep) and the best-matched interlanguage. We experiment on the benchmark dataset FlowVQA. The test results show that SRA yields improvements across the different interlanguages it upgrades, including Graphviz, Mermaid, and PlantUML.
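The two ideas in the abstract, upgrading a link-based interlanguage with semantic relations and choosing the interlanguage by question intent, can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the SRA implementation: the `Edge` type, the encoding of relations as Graphviz edge labels, the hand-written relation tags (which the approach would obtain from an LLM), and the keyword heuristic in `choose_depth` (which stands in for LLM-based intent analysis) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Edge:
    """A flowchart link; `relation` is an optional semantic tag (assumed format)."""
    src: str
    dst: str
    relation: Optional[str] = None  # e.g. "Conditional" or "Causal"


def to_dot(nodes: Dict[str, str], edges: List[Edge], with_relations: bool) -> str:
    """Serialize the flowchart as Graphviz DOT.

    with_relations=False gives the plain link-based interlanguage;
    with_relations=True adds each relation as an edge label (an assumed encoding).
    """
    lines = ["digraph G {"]
    for node_id, text in nodes.items():
        lines.append(f'  {node_id} [label="{text}"];')
    for e in edges:
        if with_relations and e.relation:
            lines.append(f'  {e.src} -> {e.dst} [label="{e.relation}"];')
        else:
            lines.append(f"  {e.src} -> {e.dst};")
    lines.append("}")
    return "\n".join(lines)


def choose_depth(question: str) -> str:
    """Toy keyword heuristic standing in for LLM-based question-intent analysis."""
    deep_cues = ("why", "what happens if", "cause", "condition", "unless")
    q = question.lower()
    return "deep" if any(cue in q for cue in deep_cues) else "shallow"


def build_context(question: str, nodes: Dict[str, str],
                  edges: List[Edge]) -> Tuple[str, str]:
    """Pick the reasoning depth, then the interlanguage variant that matches it."""
    depth = choose_depth(question)
    return depth, to_dot(nodes, edges, with_relations=(depth == "deep"))


# Hypothetical flowchart; the relation tags are hand-written for the example.
NODES = {"A": "Payment valid?", "B": "Ship order", "C": "Reject order"}
EDGES = [Edge("A", "B", "Conditional"), Edge("A", "C", "Conditional")]

if __name__ == "__main__":
    for q in ("Which step follows A?", "Why is the order rejected?"):
        depth, dot = build_context(q, NODES, EDGES)
        print(f"{q} -> {depth}\n{dot}\n")
```

In this sketch a shallow question receives only the bare links, while a deep question receives the relation-labelled graph; the resulting DOT text would then be passed to the QA model together with the question.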