Knowledge graph (KG) foundation models (KGFMs) are zero-shot generalizers: trained once, they can predict links on unseen graphs without retraining. However, when and how they generalize robustly across KGs remains an open question. In this paper, we shed light on their generalization mechanisms, showing that their performance on unseen KGs is not uniform with respect to partially seen links, which we call half-links. In particular, we show that to predict a test triple $(h,r,t)$, it may suffice in practice to have observed the half-link $(h,r)$ or $(r,t)$ in the inference graph. This yields a taxonomy of four scenarios, depending on which of the two half-links are observed. In a rigorous stratified analysis over these scenarios, we reveal that state-of-the-art (SoTA) KGFMs rely on seen half-links for predictions, while unseen half-links pose different challenges. As such, our finer-grained taxonomy can serve as a diagnostic protocol for robust KGFM generalization and highlights where novel KGFMs can improve.
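To make the four-scenario taxonomy concrete, the following is a minimal sketch of how test triples could be stratified by half-link visibility. It assumes that a half-link $(h,r)$ counts as observed when some triple $(h,r,\cdot)$ appears in the inference graph, and likewise $(r,t)$ when some $(\cdot,r,t)$ appears. The function names, the scenario labels, and the toy graph are all illustrative and are not taken from the paper.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def build_half_link_index(
    inference_graph: Iterable[Triple],
) -> Tuple[Set[Tuple[str, str]], Set[Tuple[str, str]]]:
    """Collect every (h, r) and (r, t) pair that occurs in the inference graph."""
    head_rel: Set[Tuple[str, str]] = set()
    rel_tail: Set[Tuple[str, str]] = set()
    for h, r, t in inference_graph:
        head_rel.add((h, r))
        rel_tail.add((r, t))
    return head_rel, rel_tail


def classify(triple: Triple, head_rel, rel_tail) -> str:
    """Assign a test triple to one of four scenarios (labels are hypothetical)."""
    h, r, t = triple
    seen_hr = (h, r) in head_rel
    seen_rt = (r, t) in rel_tail
    if seen_hr and seen_rt:
        return "both_seen"
    if seen_hr:
        return "only_hr_seen"
    if seen_rt:
        return "only_rt_seen"
    return "neither_seen"


def stratify(test_triples: Iterable[Triple], inference_graph: Iterable[Triple]) -> Counter:
    """Count how many test triples fall into each scenario."""
    head_rel, rel_tail = build_half_link_index(inference_graph)
    return Counter(classify(tr, head_rel, rel_tail) for tr in test_triples)


# Toy data, purely illustrative.
inference_graph = [
    ("alice", "works_at", "acme"),
    ("carol", "works_at", "globex"),
    ("bob", "lives_in", "paris"),
]
test_triples = [
    ("alice", "works_at", "globex"),   # both half-links observed
    ("alice", "works_at", "initech"),  # only (h, r) observed
    ("dave", "lives_in", "paris"),     # only (r, t) observed
    ("dave", "lives_in", "rome"),      # neither observed
]
counts = stratify(test_triples, inference_graph)
```

Reporting a model's ranking metrics separately for each of these buckets, rather than as a single aggregate, is the kind of stratified view the taxonomy suggests.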