Learning transferable representations of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures. Inspired by the success of Transformer-based pre-trained language models in learning transferable representations for text, we introduce a novel inductive KG representation model (iHT) for KG completion by large-scale pre-training. iHT consists of an entity encoder (e.g., BERT) and a neighbor-aware relational scoring function, both parameterized by Transformers. We first pre-train iHT on a large KG dataset, Wikidata5M. Our approach achieves new state-of-the-art results on matched evaluations, with a relative improvement of more than 25% in mean reciprocal rank over previous SOTA models. When further fine-tuned on smaller KGs with either entity or relational shifts, pre-trained iHT representations are shown to be transferable, significantly improving the performance on FB15K-237 and WN18RR.
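For readers unfamiliar with the metric cited above, the following is a minimal sketch of how mean reciprocal rank (MRR) and a relative improvement over a baseline are conventionally computed in KG completion. The function names and the toy ranks are illustrative assumptions; they are not taken from the iHT implementation or its reported results.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries.

    `ranks` holds the 1-based rank of the correct entity for each query.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, e.g. 0.25 means +25%."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Toy ranks, invented purely for illustration (not results from the paper).
    baseline_ranks = [2, 2, 4]
    model_ranks = [1, 2, 4]
    baseline_mrr = mean_reciprocal_rank(baseline_ranks)
    model_mrr = mean_reciprocal_rank(model_ranks)
    gain = relative_improvement(model_mrr, baseline_mrr)
    print(f"baseline MRR={baseline_mrr:.4f}, model MRR={model_mrr:.4f}, gain={gain:.1%}")
```

Under this definition, a "relative improvement of more than 25%" means the new MRR is at least 1.25 times the baseline MRR, which is a different statement from a gain of 25 absolute points.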