This paper studies the speculative reasoning task on real-world knowledge graphs (KGs) that suffer from both the \textit{false negative issue} (i.e., potentially true facts being excluded) and the \textit{false positive issue} (i.e., unreliable or outdated facts being included). State-of-the-art methods fall short in speculative reasoning ability because they assume that the correctness of a fact is determined solely by its presence in the KG, which makes them vulnerable to both issues. We formulate the new reasoning task as a noisy Positive-Unlabeled learning problem and propose a variational framework, nPUGraph, that jointly estimates the correctness of both collected and uncollected facts (which we call the \textit{label posterior}) and updates model parameters during training. Label posterior estimation facilitates speculative reasoning in two ways. First, it improves the robustness of a label posterior-aware graph encoder against false positive links. Second, it identifies missing facts to provide high-quality grounds for reasoning. The two are unified in a simple yet effective self-training procedure. Extensive experiments on three benchmark KGs and one Twitter dataset with various degrees of false negative/positive cases demonstrate the effectiveness of nPUGraph.