Cardinality estimation over Knowledge Graphs (KGs) is crucial for query optimization, yet it remains challenging because of the semi-structured nature and complex correlations of typical KGs. In this work, we propose GNCE, a novel approach that leverages knowledge graph embeddings and Graph Neural Networks (GNNs) to accurately predict the cardinality of conjunctive queries. GNCE first creates semantically meaningful embeddings for all entities in the KG. These embeddings are then integrated into the given query, and a GNN processes the embedding-annotated query to estimate its cardinality. We evaluate GNCE on several KGs in terms of q-Error and demonstrate that it outperforms state-of-the-art approaches based on sampling, summaries, and (machine) learning in estimation accuracy, while also having lower execution time and fewer parameters. Additionally, we show that GNCE can inductively generalise to unseen entities, making it suitable for dynamic query processing scenarios. Our approach has the potential to significantly improve query optimization and related applications that rely on accurate cardinality estimates of conjunctive queries.
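For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the q-Error as it is commonly defined in the cardinality estimation literature: the larger of the two ratios between the estimated and the true cardinality. The function names `q_error` and `mean_q_error`, the `floor` parameter, and the clamping of both values to at least 1 are assumptions made for illustration; the abstract does not state the exact variant used in the evaluation.

```python
from typing import Iterable, Tuple


def q_error(estimated: float, actual: float, floor: float = 1.0) -> float:
    """Return max(est / act, act / est) for one query.

    Both values are clamped to at least `floor` (an assumed convention)
    so that empty results or zero estimates do not cause division by zero.
    """
    est = max(estimated, floor)
    act = max(actual, floor)
    return max(est / act, act / est)


def mean_q_error(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the q-Error over (estimated, actual) cardinality pairs."""
    errors = [q_error(est, act) for est, act in pairs]
    if not errors:
        raise ValueError("at least one (estimated, actual) pair is required")
    return sum(errors) / len(errors)
```

A q-Error of 1 indicates a perfect estimate. The metric is symmetric, so overestimating and underestimating by the same factor are penalised equally.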