Enhancing Knowledge Graph Embedding Models with Semantic-driven Loss Functions

Knowledge graph embedding models (KGEMs) are used for various tasks related to knowledge graphs (KGs), including link prediction. They are trained with loss functions that are computed over a batch of scored triples and their corresponding labels. Traditional approaches consider the label of a triple to be either true or false. However, recent work suggests that not all negative triples should be valued equally. In line with this commonly adopted assumption, we posit that semantically valid negative triples might be high-quality negative triples. As such, loss functions should treat them differently from semantically invalid negative ones. To this end, we propose semantic-driven versions of the three most widely used loss functions for link prediction. In particular, we treat the scores of negative triples differently by injecting background knowledge about relation domains and ranges into the loss functions. In an extensive and controlled experimental setting, we show that the proposed loss functions systematically provide satisfying results on three public benchmark KGs underpinned by different schemas, which demonstrates both the generality and superiority of our proposed approach. In fact, the proposed loss functions not only lead to better MRR and Hits@10 values, but also drive KGEMs towards better semantic awareness. This highlights that semantic information globally improves KGEMs, and thus should be incorporated into loss functions whenever such information is available.
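
To make the idea of treating negative triples differently more concrete, here is a minimal sketch of one way a pairwise hinge loss could use domain and range information. The abstract does not give the exact formulation, so everything below is an illustrative assumption rather than the authors' method: the function names, the `epsilon` factor that shrinks the margin for semantically valid negatives, and the convention that a higher score means a more plausible triple are all hypothetical.

```python
from typing import Sequence, Set


def is_semantically_valid(
    head_types: Set[str], tail_types: Set[str], domain: str, range_: str
) -> bool:
    """Hypothetical check: a triple is semantically valid when its head
    belongs to the relation's domain and its tail to the relation's range."""
    return domain in head_types and range_ in tail_types


def semantic_hinge_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    neg_is_valid: Sequence[bool],
    margin: float = 1.0,
    epsilon: float = 0.25,
) -> float:
    """Pairwise hinge loss with a validity-dependent margin (assumed form).

    Convention assumed here: higher score = more plausible triple.
    Semantically valid negatives use the smaller margin `margin * epsilon`,
    so they are pushed away from positives less aggressively than
    semantically invalid negatives, which keep the full margin.
    """
    total = 0.0
    for pos, neg, valid in zip(pos_scores, neg_scores, neg_is_valid):
        gamma = margin * epsilon if valid else margin
        total += max(0.0, gamma + neg - pos)
    return total / len(pos_scores)


# Toy batch: two positives, each paired with one negative.
# The first negative respects the relation's domain and range, the second does not.
valid_flags = [
    is_semantically_valid({"Person"}, {"City"}, "Person", "City"),
    is_semantically_valid({"Person"}, {"Film"}, "Person", "City"),
]
loss = semantic_hinge_loss([2.0, 2.0], [1.5, 1.5], valid_flags)
```

With equal raw scores, only the semantically invalid negative contributes to the loss in this toy batch, because the valid one already satisfies its reduced margin. Other design choices (for example, reweighting labels rather than margins) would fit the same description.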


