Cardinality estimation over Knowledge Graphs (KGs) is crucial for query optimization, yet it remains challenging because of the semi-structured nature and complex correlations of typical KGs. In this work, we propose GNCE, a novel approach that leverages knowledge graph embeddings and Graph Neural Networks (GNNs) to accurately predict the cardinality of conjunctive queries. GNCE first creates semantically meaningful embeddings for all entities in the KG. These embeddings are then integrated into the given query, and a GNN processes the embedded query to estimate its cardinality. We evaluate GNCE on several KGs in terms of q-Error and demonstrate that it outperforms state-of-the-art approaches based on sampling, summaries, and (machine) learning in estimation accuracy, while also having lower execution time and fewer parameters. Additionally, we show that GNCE can inductively generalize to unseen entities, making it suitable for dynamic query processing scenarios. Our approach has the potential to significantly improve query optimization and related applications that rely on accurate cardinality estimates of conjunctive queries.
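For readers unfamiliar with the evaluation metric, the q-Error is commonly defined as the larger of the two ratios between the estimated and the true cardinality, so that over- and underestimation are penalized symmetrically and a perfect estimate scores 1. The following is a minimal sketch of that common definition. The abstract does not spell out the exact formula used, so the function name `q_error` and the clamping of both values to a minimum of 1 (to avoid division by zero) are assumptions made for illustration.

```python
def q_error(estimate: float, truth: float, floor: float = 1.0) -> float:
    """Return max(estimate/truth, truth/estimate).

    Both values are clamped to at least `floor` (an assumption of this
    sketch) so that empty results do not cause a division by zero.
    """
    est = max(estimate, floor)
    true = max(truth, floor)
    return max(est / true, true / est)


# Underestimating and overestimating by the same factor give the same score.
print(q_error(50, 200))   # estimate is 4x too low
print(q_error(200, 50))   # estimate is 4x too high
print(q_error(120, 120))  # perfect estimate
```

Because the metric is a ratio, it treats an estimate of 50 for a true cardinality of 200 the same as an estimate of 200 for a true cardinality of 50, which matches how a query optimizer is affected by relative rather than absolute errors.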