The submodular width is a complexity measure of conjunctive queries (CQs) that assigns a nonnegative real number, subw(Q), to each CQ Q. An existing algorithm, called PANDA, performs CQ evaluation in polynomial time, where the exponent is essentially subw(Q). Formally, for every Boolean CQ Q, PANDA evaluates Q in time $O(N^{\mathsf{subw}(Q)} \cdot \mathsf{polylog}(N))$, where N denotes the input size; moreover, there is complexity-theoretic evidence that, for a number of Boolean CQs, no exponent strictly below subw(Q) can be achieved by combinatorial algorithms. At a high level, the submodular width of a CQ Q can be described as a maximum taken over all polymatroids, which are set functions on the variables of Q that satisfy the Shannon inequalities. The PANDA algorithm in a sense works in the dual space of this maximization problem: it makes use of information theory and transforms a CQ into a set of disjunctive datalog programs, which are solved individually. In this article, we introduce a new algorithm for CQ evaluation which achieves, for each Boolean CQ Q and for all $\varepsilon > 0$, a running time of $O(N^{\mathsf{subw}(Q)+\varepsilon})$. The description and analysis of this new algorithm are, in our view, significantly simpler than those of PANDA. We refer to it as a "primal" algorithm because it operates in the primal space of the described maximization problem, maintaining a feasible primal solution, namely, a polymatroid. Indeed, this algorithm deals directly with the input CQ and adaptively computes a sequence of joins, in a guided fashion, so that the cost of these join computations is bounded. Additionally, this algorithm achieves the stated runtime for the generalization of the submodular width that incorporates degree constraints. We dub our algorithm Jaguar, as it is a join-adaptive guided algorithm.
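To make the notion of a polymatroid concrete, the following Python sketch checks the Shannon inequalities (zero on the empty set, monotonicity, submodularity) for a set function given as a table. It is only an illustration of the definition and is not part of Jaguar or PANDA. The helper names `powerset` and `is_polymatroid`, the triangle query used as the example, and the normalization in which each atom is bounded by 1 (that is, sizes are measured in units of log N) are all assumptions made for this sketch.

```python
from itertools import combinations


def powerset(variables):
    """Return all subsets of `variables` as frozensets (hypothetical helper)."""
    items = sorted(variables)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def is_polymatroid(h, variables, tol=1e-9):
    """Check that the table h (frozenset -> float) satisfies the Shannon inequalities."""
    subsets = powerset(variables)
    # h must vanish on the empty set.
    if abs(h[frozenset()]) > tol:
        return False
    for x in subsets:
        for y in subsets:
            # Monotonicity: X subset of Y implies h(X) <= h(Y).
            if x <= y and h[x] > h[y] + tol:
                return False
            # Submodularity: h(X | Y) + h(X & Y) <= h(X) + h(Y).
            if h[x | y] + h[x & y] > h[x] + h[y] + tol:
                return False
    return True


# Assumed example: the triangle query R(a,b), S(b,c), T(a,c).
variables = {"a", "b", "c"}
atoms = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

# Candidate polymatroid: every variable contributes 1/2.
h = {s: len(s) / 2 for s in powerset(variables)}

# Assumed normalization: each atom has value at most 1 (in units of log N).
respects_atoms = all(h[frozenset(atom)] <= 1.0 for atom in atoms)

# A set function that violates submodularity, for contrast.
not_submodular = {s: float(len(s) ** 2) for s in powerset(variables)}

if __name__ == "__main__":
    print(is_polymatroid(h, variables), respects_atoms, h[frozenset(variables)])
    print(is_polymatroid(not_submodular, variables))
```

Here `h` passes the check and assigns 3/2 to the full variable set, while `not_submodular` fails because its value on a two-element set exceeds the sum of its values on the two singletons. The brute-force check is exponential in the number of variables, which is acceptable only because the number of variables in a query is treated as a constant.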