Graph neural networks (GNNs) are frequently used to predict missing facts in knowledge graphs (KGs). Motivated by the lack of explainability of these models' outputs, recent work has aimed to explain their predictions using Datalog, a widely used logic-based formalism; however, such work has been restricted to certain subclasses of GNNs. In this paper, we consider R-GCN, one of the most popular GNN architectures for KGs, and we provide two methods for extracting rules that explain its predictions and are sound, in the sense that every fact derived by the rules on any input dataset is also predicted by the GNN. Furthermore, we provide a method that can verify that certain classes of Datalog rules are not sound for a given R-GCN. In our experiments, we train R-GCNs on KG completion benchmarks and verify that no Datalog rule is sound for these models, even though the models often achieve high to near-perfect accuracy. This raises concerns about the ability of R-GCN models to generalise and about the explainability of their predictions. We therefore propose two variations of the R-GCN training paradigm that encourage the model to learn sound rules, and we observe a trade-off between model accuracy and the number of sound rules learned.
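To make the soundness notion above concrete, the following is a minimal sketch (not the paper's extraction or verification method) of checking empirically, on a single toy dataset, whether a Datalog rule is sound for a link predictor: the rule is applied to the KG, and each derived fact is checked against the model's predictions. The rule, the toy KG, and the stand-in predictor are all hypothetical illustrations; a real check would score triples with a trained R-GCN, and the paper's methods establish soundness over all input datasets, not just one.

```python
# A toy KG as a set of (head, relation, tail) facts.
kg = {
    ("alice", "parent_of", "bob"),
    ("bob", "parent_of", "carol"),
    ("alice", "parent_of", "dana"),
}

def apply_rule(facts):
    """One application of the Datalog rule
    grandparent_of(X, Z) :- parent_of(X, Y), parent_of(Y, Z)."""
    derived = set()
    for (x, r1, y) in facts:
        for (y2, r2, z) in facts:
            if r1 == "parent_of" and r2 == "parent_of" and y == y2:
                derived.add((x, "grandparent_of", z))
    return derived

def toy_model_predicts(fact):
    """Stand-in for an R-GCN's thresholded triple score (hard-coded oracle)."""
    return fact in {("alice", "grandparent_of", "carol")}

# The rule is sound for this model on this dataset iff no derived fact
# goes unpredicted; any counterexample refutes soundness.
derived = apply_rule(kg)
counterexamples = {f for f in derived if not toy_model_predicts(f)}
print(sorted(counterexamples))  # → [] here: every derived fact is predicted
```

Note that an empirical check like this can only refute soundness (by exhibiting a counterexample on some dataset); certifying soundness for every possible input dataset requires reasoning about the model itself, which is what the paper's verification method provides.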