Leveraging Pre-trained Language Models for Time Interval Prediction in Text-Enhanced Temporal Knowledge Graphs

Most knowledge graph completion (KGC) methods learn latent representations of the entities and relations of a given graph by mapping them into a vector space. Although the majority of these methods focus on static knowledge graphs, many publicly available KGs contain temporal information stating the time instant or period over which a certain fact has been true. Such graphs are often known as temporal knowledge graphs. Furthermore, knowledge graphs may also contain textual descriptions of entities and relations. Static KGC methods take neither temporal information nor textual descriptions into account during representation learning; they leverage only the structural information of the graph. Recently, some studies have used temporal information to improve link prediction, yet they do not exploit textual descriptions and do not support inductive inference (prediction on entities that have not been seen in training). We propose a novel framework called TEMT that exploits the power of pre-trained language models (PLMs) for text-enhanced temporal knowledge graph completion. The knowledge stored in the parameters of a PLM allows TEMT to produce rich semantic representations of facts and to generalize to previously unseen entities. TEMT leverages the textual and temporal information available in a KG, treats them separately, and fuses them to obtain plausibility scores of facts. Unlike previous approaches, TEMT effectively captures dependencies across different time points and enables predictions on unseen entities. To assess the performance of TEMT, we carried out several experiments, including time interval prediction in both transductive and inductive settings, and triple classification. The experimental results show that TEMT is competitive with the state of the art.
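To make the idea of fusing a textual and a temporal representation into a plausibility score more concrete, the following is a minimal sketch under stated assumptions, not TEMT's actual architecture. The function names `plausibility` and `best_interval`, the concatenate-then-linear-layer fusion, the 0.5 threshold, and the maximum-sum interval selection are all hypothetical choices made for illustration; in practice the text vector would come from a PLM and the weights would be learned.

```python
import math


def plausibility(text_vec, time_vec, weights, bias=0.0):
    """Score one (fact, time point) pair.

    Assumed fusion: concatenate the text and time vectors, apply a
    linear layer, then squash to (0, 1) with a sigmoid.
    """
    fused = list(text_vec) + list(time_vec)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused vector length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def best_interval(years, scores, threshold=0.5):
    """Pick the contiguous span of years with the largest total margin.

    Assumed selection rule: maximize sum(score - threshold) over all
    contiguous spans (a maximum-subarray scan). Returns (start, end),
    or None if no scores are given.
    """
    best_sum, best_span = float("-inf"), None
    cur_sum, cur_start = 0.0, 0
    for i, score in enumerate(scores):
        if cur_sum <= 0:
            # A non-positive running total cannot help; restart here.
            cur_sum, cur_start = 0.0, i
        cur_sum += score - threshold
        if cur_sum > best_sum:
            best_sum, best_span = cur_sum, (years[cur_start], years[i])
    return best_span


# Toy per-year plausibility scores for a single fact (made-up values).
years = list(range(2000, 2007))
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.3, 0.1]
predicted = best_interval(years, scores)
```

With these toy scores, `predicted` is `(2002, 2004)`, the run of years whose scores exceed the threshold. Because the scoring function only needs a text vector for the fact, such a design could in principle score facts about entities never seen in training, which is the inductive setting described above.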
