Large Language Models (LLMs) have recently made impressive strides in natural language understanding tasks. Despite their remarkable performance, understanding their decision-making process remains a major challenge. In this paper, we aim to bring transparency to this process by introducing a new explanation dataset for question answering (QA) tasks that integrates knowledge graphs (KGs) in a novel way. Our dataset includes 12,102 question-answer-explanation (QAE) triples. Each explanation in the dataset links the LLM's reasoning to entities and relations in the KGs. The explanation component includes a why-choose explanation, a why-not-choose explanation, and a set of reason-elements that underlie the LLM's decision. We leverage KGs and graph attention networks (GAT) to find the reason-elements and transform them into why-choose and why-not-choose explanations that are comprehensible to humans. Through quantitative and qualitative evaluations, we demonstrate the potential of our dataset to improve the in-context learning of LLMs and to enhance their interpretability and explainability. Our work contributes to the field of explainable AI by enabling a deeper understanding of LLMs' decision-making process, making them more transparent and thereby potentially more reliable to researchers and practitioners alike. Our dataset is available at: https://github.com/chen-zichen/XplainLLM_dataset.git