Commonsense knowledge graph completion is a new challenge for commonsense knowledge graph construction and application. In contrast to factual knowledge graphs such as Freebase and YAGO, commonsense knowledge graphs (CSKGs; e.g., ConceptNet) use free-form text to represent named entities, short phrases, and events as their nodes. This loose structure makes CSKGs large and sparse, so semantic understanding of the nodes becomes more critical for learning rich CSKG embeddings. While current methods leverage semantic similarity to increase graph density, the semantic plausibility of the nodes and their relations is under-explored. Previous works adopt conceptual abstraction to improve the consistency of modeling (event) plausibility, but they are not scalable enough and still suffer from data sparsity. In this paper, we propose to use textual entailment to find implicit entailment relations between CSKG nodes, effectively densifying the subgraph that connects nodes within the same conceptual class, which indicates a similar level of plausibility. Each node in the CSKG finds its top entailed nodes using a transformer fine-tuned on natural language inference (NLI) tasks, which sufficiently captures textual entailment signals. The entailment relations between these nodes are further used to: 1) build new connections between source triplets and entailed nodes to densify the sparse CSKGs; 2) enrich the generalization ability of node representations by comparing the node embeddings with a contrastive loss. Experiments on two standard CSKGs demonstrate that our proposed framework, EntailE, can improve the performance of CSKG completion tasks under both transductive and inductive settings.
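The two uses of entailment described above can be illustrated with a minimal sketch. It is not the EntailE implementation, and the following are assumptions made only for illustration: `toy_entailment_score` is a token-overlap stand-in for the fine-tuned NLI transformer, `densify` adds one new triple per entailed head node (the exact connection scheme is not specified here), and `info_nce` is a generic InfoNCE-style contrastive loss over cosine similarities. All function names, the threshold, and the temperature are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: fraction of the
    hypothesis tokens that also appear in the premise."""
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / len(h) if h else 0.0


def top_entailed(nodes: Sequence[str],
                 score: Callable[[str, str], float],
                 k: int = 2,
                 threshold: float = 0.6) -> Dict[str, List[str]]:
    """For each node, keep up to k other nodes it entails most strongly."""
    result: Dict[str, List[str]] = {}
    for premise in nodes:
        scored = [(score(premise, h), h) for h in nodes if h != premise]
        kept = [(s, h) for s, h in scored if s >= threshold]
        kept.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
        result[premise] = [h for _, h in kept[:k]]
    return result


def densify(triples: Sequence[Triple],
            entailed: Dict[str, List[str]]) -> List[Triple]:
    """Assumed scheme: for each source triple (h, r, t), add (e, r, t)
    for every node e entailed by h, skipping duplicates."""
    seen = set(triples)
    out = list(triples)
    for h, r, t in triples:
        for e in entailed.get(h, []):
            new = (e, r, t)
            if new not in seen:
                seen.add(new)
                out.append(new)
    return out


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def info_nce(anchor: Sequence[float],
             positive: Sequence[float],
             negatives: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """InfoNCE-style loss: low when the anchor is closer to its
    entailed (positive) node than to the negative nodes."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


if __name__ == "__main__":
    nodes = ["eat a red apple", "eat apple", "drive car"]
    entailed = top_entailed(nodes, toy_entailment_score)
    source = [("eat a red apple", "HasSubevent", "chew")]
    print(densify(source, entailed))
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
```

In practice the scoring function would be replaced by the entailment probability from an NLI-fine-tuned transformer, and the node vectors passed to the loss would come from the graph embedding model rather than being hand-written.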