The submodular width is a complexity measure of conjunctive queries (CQs) that assigns a nonnegative real number, $\mathsf{subw}(Q)$, to each CQ $Q$. An existing algorithm, called PANDA, performs CQ evaluation in polynomial time where the exponent is essentially $\mathsf{subw}(Q)$. Formally, for every Boolean CQ $Q$, PANDA evaluates $Q$ in time $O(N^{\mathsf{subw}(Q)} \cdot \mathsf{polylog}(N))$, where $N$ denotes the input size; moreover, there is complexity-theoretic evidence that, for a number of Boolean CQs, no exponent strictly below $\mathsf{subw}(Q)$ can be achieved by combinatorial algorithms. On a high level, the submodular width of a CQ $Q$ can be described as a maximum taken over polymatroids, which are set functions on the variables of $Q$ that satisfy the Shannon inequalities. The PANDA algorithm in a sense works in the dual space of this maximization problem: it makes use of information theory and transforms a CQ into a set of disjunctive datalog programs, which are solved individually. In this article, we introduce a new algorithm for CQ evaluation that achieves, for each Boolean CQ $Q$ and for all $\varepsilon > 0$, a running time of $O(N^{\mathsf{subw}(Q)+\varepsilon})$. This new algorithm's description and analysis are, in our view, significantly simpler than those of PANDA. We refer to it as a "primal" algorithm because it operates in the primal space of the described maximization problem by maintaining a feasible primal solution, namely, a polymatroid. Indeed, this algorithm deals directly with the input CQ and adaptively computes a sequence of joins, in a guided fashion, so that the cost of these join computations is bounded. Additionally, this algorithm achieves the stated runtime for the generalization of the submodular width that incorporates degree constraints. We dub our algorithm Jaguar, as it is a join-adaptive guided algorithm.
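To make the notion of a polymatroid concrete, the following minimal sketch checks the Shannon inequalities (zero on the empty set, monotonicity, submodularity) for a set function given explicitly as a table. It is illustrative only and is not part of Jaguar or PANDA; the names `powerset` and `is_polymatroid` and the three-variable example are assumptions introduced here.

```python
from itertools import combinations


def powerset(variables):
    """Return all subsets of `variables` as frozensets."""
    items = sorted(variables)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_polymatroid(h, variables, tol=1e-9):
    """Check the Shannon inequalities for a set function h.

    h maps every frozenset of variables to a real number.
    This is a brute-force check over all pairs of subsets, so it is
    only practical for a handful of variables.
    """
    subsets = powerset(variables)
    # h(empty set) = 0
    if abs(h[frozenset()]) > tol:
        return False
    for s in subsets:
        for t in subsets:
            # Monotonicity: S subset of T implies h(S) <= h(T)
            if s <= t and h[s] > h[t] + tol:
                return False
            # Submodularity: h(S u T) + h(S n T) <= h(S) + h(T)
            if h[s | t] + h[s & t] > h[s] + h[t] + tol:
                return False
    return True


# Hypothetical example on three variables A, B, C.
variables = {"A", "B", "C"}

# A modular function h(S) = |S| / 2; every pair of variables gets value 1.
half = {s: len(s) / 2 for s in powerset(variables)}

# A supermodular function h(S) = |S|^2, which violates submodularity.
squared = {s: len(s) ** 2 for s in powerset(variables)}

print(is_polymatroid(half, variables))     # True
print(is_polymatroid(squared, variables))  # False
```

The check enumerates all pairs of subsets, which is exponential in the number of variables; this is acceptable for small illustrations because the number of variables in a query is treated as a constant.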