Over the last decade, worst-case optimal join (WCOJ) algorithms have emerged as a new paradigm for one of the most fundamental challenges in query processing: computing joins efficiently. Such algorithms can be asymptotically faster than traditional binary joins while remaining simple to understand and implement. However, they have been found to be less efficient than traditional binary join plans on the typical acyclic queries found in practice. Some database systems that support WCOJ therefore use a hybrid approach: they process the cyclic subparts of the query (if any) with WCOJ and rely on traditional binary joins otherwise. In this paper we propose a new framework, called Free Join, that unifies the two paradigms. We describe a new type of plan, a new data structure (which unifies the hash tables and tries used by the two paradigms), and a suite of optimization techniques. Our system, implemented in Rust, matches or outperforms both traditional binary joins and Generic Join on standard query benchmarks.
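To make the contrast between the two paradigms concrete, the following minimal sketch evaluates the triangle query Q(a, b, c) :- R(a, b), S(b, c), T(a, c) in both styles. It is an illustration only and not the Free Join implementation: the function names (`build_trie`, `binary_join`, `generic_join`, `intersect`), the `u32` tuple encoding, and the choice of the triangle query are all assumptions made for this example. The binary plan joins R with S through a hash table and then filters against T; the Generic Join-style plan binds one variable at a time and intersects the candidate values offered by every relation containing that variable.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

// Hypothetical two-level trie for a binary relation: first column -> set of second columns.
type Trie = HashMap<u32, HashSet<u32>>;

fn build_trie(rel: &[(u32, u32)]) -> Trie {
    let mut trie: Trie = HashMap::new();
    for &(x, y) in rel {
        trie.entry(x).or_default().insert(y);
    }
    trie
}

// Binary-join style: (R join S on b), then probe T with (a, c).
// The intermediate result R join S can be much larger than the final output.
fn binary_join(
    r: &[(u32, u32)],
    s: &[(u32, u32)],
    t: &[(u32, u32)],
) -> BTreeSet<(u32, u32, u32)> {
    let s_by_b = build_trie(s); // hash table on S keyed by b
    let t_set: HashSet<(u32, u32)> = t.iter().copied().collect();
    let mut out = BTreeSet::new();
    for &(a, b) in r {
        if let Some(cs) = s_by_b.get(&b) {
            for &c in cs {
                if t_set.contains(&(a, c)) {
                    out.insert((a, b, c));
                }
            }
        }
    }
    out
}

// Intersect two sets by scanning the smaller one and probing the larger one.
fn intersect<'a>(x: &'a HashSet<u32>, y: &'a HashSet<u32>) -> impl Iterator<Item = u32> + 'a {
    let (small, large) = if x.len() <= y.len() { (x, y) } else { (y, x) };
    small.iter().copied().filter(move |v| large.contains(v))
}

// Generic Join style: bind a, then b, then c, intersecting candidates at each level.
// Simplification (assumption of this sketch): the outer two levels scan R's trie
// and probe the other trie instead of always scanning the smaller side.
fn generic_join(
    r: &[(u32, u32)],
    s: &[(u32, u32)],
    t: &[(u32, u32)],
) -> BTreeSet<(u32, u32, u32)> {
    let (r_ab, s_bc, t_ac) = (build_trie(r), build_trie(s), build_trie(t));
    let mut out = BTreeSet::new();
    for (&a, bs) in &r_ab {
        // a must appear in both R and T
        let Some(cs_t) = t_ac.get(&a) else { continue };
        for &b in bs {
            // b must appear in both R[a] and S
            let Some(cs_s) = s_bc.get(&b) else { continue };
            // c must appear in both S[b] and T[a]
            for c in intersect(cs_s, cs_t) {
                out.insert((a, b, c));
            }
        }
    }
    out
}
```

Both functions return the same set of tuples on any input; they differ in how much intermediate work they perform. On cyclic queries such as this one, the variable-at-a-time strategy avoids materializing a large R-S intermediate result, while on acyclic queries the binary plan's single hash probe per tuple tends to be cheaper, which is the trade-off the abstract describes.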