Knowledge graphs (KGs) comprise entities interconnected by relations with different semantic meanings. KGs are used in a wide range of applications. However, they are inherently incomplete, i.e., entities or facts about entities are missing. Consequently, a large body of work focuses on completing the missing information in KGs, a task commonly referred to as link prediction (LP). This task has traditionally been studied extensively in the transductive setting, where all entities and relations in the test set are observed during training. Recently, several works have tackled LP under more challenging settings, where entities and relations in the test set may be unobserved during training, or may appear in only a few facts. These settings are known as inductive, few-shot, and zero-shot link prediction. In this work, we conduct a systematic review of existing works in this area. A thorough analysis leads us to point out the undesirable existence of diverging terminologies and task definitions for these settings, which in turn limits the possibility of comparing recent works. We therefore examine each setting in detail, attempting to reveal its intrinsic characteristics. Finally, we propose a unifying nomenclature that refers to each setting in a simple and consistent manner.
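The distinction drawn above between test facts whose entities and relations were all observed during training and those that involve unobserved or rarely observed ones can be illustrated with a minimal sketch. It assumes facts are stored as `(head, relation, tail)` tuples. The function names, the output keys, and the `few_threshold` cutoff are hypothetical choices made for illustration; they are not the nomenclature proposed in this work, and the sketch deliberately reports only what was observed rather than assigning an "inductive", "few-shot", or "zero-shot" label.

```python
from collections import Counter


def count_occurrences(train_triples):
    """Count how often each entity and relation appears in the training facts."""
    entity_counts = Counter()
    relation_counts = Counter()
    for head, relation, tail in train_triples:
        entity_counts[head] += 1
        entity_counts[tail] += 1
        relation_counts[relation] += 1
    return entity_counts, relation_counts


def describe_test_triple(triple, entity_counts, relation_counts, few_threshold=3):
    """Report which parts of a test triple are unseen or rarely seen in training.

    few_threshold is an assumed cutoff for "appears in only a few facts";
    the abstract does not fix a specific value.
    """
    head, relation, tail = triple
    # A Counter returns 0 for keys it has never seen, so unseen items need no special case.
    counts = {
        "head": entity_counts[head],
        "relation": relation_counts[relation],
        "tail": entity_counts[tail],
    }
    unseen = [part for part, n in counts.items() if n == 0]
    rare = [part for part, n in counts.items() if 0 < n <= few_threshold]
    return {
        "unseen": unseen,              # parts never observed during training
        "rare": rare,                  # parts observed, but in at most few_threshold facts
        "all_observed": not unseen,    # True when every part appears in training
    }
```

A test set in which every triple yields `all_observed == True` matches the transductive condition described above. Triples with a non-empty `unseen` or `rare` list fall under the more challenging settings, whose exact names and boundaries are what the review sets out to disentangle.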