Join evaluation is one of the most fundamental operations performed by database systems and arguably the best-studied problem in the database community. A large number of join algorithms have been developed, and commercial database engines use finely tuned join heuristics that account for many factors, including predicate selectivity, memory, and I/O. However, most existing results cater either to full join queries or to non-full join queries with degree constraints (such as PK-FK relationships) that make joins \emph{easier} to evaluate. Furthermore, most of these algorithms are not output-sensitive. In this paper, we present a novel output-sensitive algorithm for evaluating acyclic Conjunctive Queries (CQs) that contain arbitrary free variables. Our result is based on a novel generalization of the Yannakakis algorithm and shows that its running time guarantee can be improved by a polynomial factor. Importantly, unlike a recently proposed algorithm, our improvement does not rely on fast matrix multiplication. We complement the upper bound with matching lower bounds conditioned on two variants of the $k$-clique conjecture. Our algorithm recovers known prior results and improves on the state of the art for common queries such as paths and stars.