Knowledge Graphs (KGs) have seen increasing use across various domains, from biomedicine and linguistics to general knowledge modelling. To facilitate the analysis of KGs, Knowledge Graph Embeddings (KGEs) have been developed to automatically analyse KGs and predict new facts based on the information they contain, a task called "link prediction". Many existing studies have documented that the structure of a KG, KGE model components, and KGE hyperparameters can significantly change how well KGEs perform and what relationships they are able to learn. Recently, the Topologically-Weighted Intelligence Generation (TWIG) model has been proposed as a solution to modelling how these elements relate to one another. In this work, we extend the previous research on TWIG and evaluate its ability to simulate the output of the KGE model ComplEx in the cross-KG setting. Our results are twofold. First, TWIG is able to summarise KGE performance across a wide range of hyperparameter settings and KGs, suggesting that it captures general knowledge of how to predict KGE performance from KG structure. Second, we show that TWIG can successfully predict hyperparameter performance on unseen KGs in the zero-shot setting. This second observation leads us to propose that, with additional research, optimal hyperparameters for KGE models could be selected in a pre-hoc manner using TWIG-like methods, rather than through a full hyperparameter search.