This paper investigates the cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transferring knowledge from TKGs in high-resource ones. The ability to distill knowledge across TKGs in different languages is becoming increasingly crucial, given the unsatisfactory performance of existing reasoning methods on severely incomplete TKGs, especially those in low-resource languages. However, it poses two major challenges. First, the cross-lingual alignments, which serve as bridges for knowledge transfer, are usually too scarce to transfer sufficient knowledge between two TKGs. Second, temporal knowledge discrepancy between aligned entities, especially when alignments are unreliable, can mislead the knowledge distillation process. To address these challenges, we propose a mutually-paced knowledge distillation model, MP-KD, in which a teacher network trained on a source TKG guides the training of a student network on a target TKG through an alignment module. Concretely, to deal with the scarcity issue, MP-KD generates pseudo alignments between TKGs based on the temporal information extracted by our representation module. To maximize the efficacy of knowledge transfer and control the noise caused by the temporal knowledge discrepancy, we enhance MP-KD with a temporal cross-lingual attention mechanism that dynamically estimates the alignment strength. The two procedures are mutually paced along with model training. Extensive experiments on twelve cross-lingual TKG transfer tasks in the EventKG benchmark demonstrate the effectiveness of the proposed MP-KD method.
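To make the two procedures concrete, the following is a minimal illustrative sketch and not the MP-KD implementation. It assumes entity representations are already available as plain vectors, proposes pseudo alignments by nearest-neighbour cosine similarity above a threshold, and uses a sigmoid of that similarity as a simple stand-in for the temporal cross-lingual attention that estimates alignment strength. All function names, the `threshold` and `temperature` parameters, and the squared-distance loss are assumptions introduced for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pseudo_alignments(src_emb, tgt_emb, threshold):
    """Hypothetical pseudo-alignment step: pair each target entity with its most
    similar source entity, keeping the pair only if similarity >= threshold."""
    pairs = []
    for t_id, t_vec in tgt_emb.items():
        best_id, best_sim = None, -1.0
        for s_id, s_vec in src_emb.items():
            sim = cosine(s_vec, t_vec)
            if sim > best_sim:
                best_id, best_sim = s_id, sim
        if best_id is not None and best_sim >= threshold:
            pairs.append((best_id, t_id, best_sim))
    return pairs


def alignment_strength(similarity, temperature=1.0):
    """Assumed stand-in for the attention module: squash similarity into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-similarity / temperature))


def distillation_loss(src_emb, tgt_emb, pairs, temperature=1.0):
    """Strength-weighted squared distance between teacher (source) and student
    (target) representations, averaged over the aligned pairs."""
    if not pairs:
        return 0.0
    total = 0.0
    for s_id, t_id, sim in pairs:
        weight = alignment_strength(sim, temperature)
        sq_dist = sum((a - b) ** 2 for a, b in zip(src_emb[s_id], tgt_emb[t_id]))
        total += weight * sq_dist
    return total / len(pairs)


# Toy example: two source entities, two target entities.
source = {"s1": [1.0, 0.0], "s2": [0.0, 1.0]}
target = {"t1": [0.9, 0.1], "t2": [-1.0, -1.0]}
aligned = pseudo_alignments(source, target, threshold=0.8)
loss = distillation_loss(source, target, aligned)
```

In this toy setting only `t1` is paired (with `s1`), because `t2` has no source entity above the threshold. One way to read "mutually paced" in this sketch is to lower `threshold` gradually as the representations improve, so that more pseudo alignments are admitted while the strength weights keep unreliable pairs from dominating the loss; the schedule actually used by MP-KD is not specified in the abstract.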