A fundamental question in database theory is the following: given a Boolean conjunctive query Q, what is the best complexity for computing the answer to Q in terms of the input database size N? Within the class of combinatorial algorithms, the best known complexity for any query Q is captured by the submodular width of Q. Beyond combinatorial algorithms, however, certain queries are known to admit faster algorithms, which often involve a clever combination of fast matrix multiplication and data partitioning. Nevertheless, there is no systematic way to derive and analyze the complexity of such algorithms for arbitrary queries Q. In this work, we introduce a general framework that captures the best complexity for answering any Boolean conjunctive query Q using matrix multiplication. Our framework unifies combinatorial and non-combinatorial techniques under the umbrella of information theory. It generalizes the notion of submodular width to a new, stronger notion called the omega-submodular width, which naturally incorporates the power of fast matrix multiplication. We describe a matching algorithm that computes the answer to any query Q in time corresponding to the omega-submodular width of Q. To the best of our knowledge, our framework recovers the best known complexities for the Boolean queries studied in the literature, and it also yields new algorithms for some classes of queries that improve upon the best known complexities.
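As a small illustration of the "matrix multiplication plus data partitioning" pattern mentioned above, the following sketch decides the Boolean triangle query Q() :- E(a,b), E(b,c), E(c,a) on an undirected graph using the classical heavy/light split by vertex degree. It is not the framework or algorithm of this work. The function name `has_triangle` and the parameter `threshold` are hypothetical, and the bitset row union used in the heavy case is only a stand-in for a genuinely fast matrix multiplication routine, so the sketch shows the structure of the technique but not its asymptotic speedup.

```python
from collections import defaultdict


def has_triangle(edges, threshold):
    """Decide whether an undirected graph contains a triangle.

    `threshold` (a hypothetical tuning parameter) separates light
    vertices (degree <= threshold) from heavy ones (degree > threshold).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Light case (combinatorial): any triangle containing a light vertex
    # is found by checking every pair of that vertex's neighbours.
    for v, nbrs in adj.items():
        if len(nbrs) <= threshold:
            ns = list(nbrs)
            for i in range(len(ns)):
                for j in range(i + 1, len(ns)):
                    if ns[j] in adj[ns[i]]:
                        return True

    # Heavy case: every remaining triangle uses heavy vertices only.
    heavy = [v for v in adj if len(adj[v]) > threshold]
    index = {v: i for i, v in enumerate(heavy)}

    # Row i of the heavy-vertex adjacency matrix, stored as an int bitset.
    rows = [0] * len(heavy)
    for v in heavy:
        for w in adj[v]:
            if w in index:
                rows[index[v]] |= 1 << index[w]

    # Stand-in for the Boolean product A * A: row i of A^2 is the union of
    # the rows of i's heavy neighbours. A triangle exists if some j is both
    # two steps away from i and adjacent to i.
    for i in range(len(heavy)):
        two_step = 0
        remaining = rows[i]
        k = 0
        while remaining:
            if remaining & 1:
                two_step |= rows[k]
            remaining >>= 1
            k += 1
        if two_step & rows[i]:
            return True
    return False
```

With `threshold=0` every non-isolated vertex is heavy and the whole check goes through the matrix-style step; with a very large `threshold` every vertex is light and the check is purely combinatorial. Intermediate values trade one cost against the other, which is the kind of balance that complexity analyses of such algorithms have to capture.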