We introduce EDB-bounded datalog, a framework for deriving upper bounds on intermediate result sizes and on the asymptotic complexity of recursive datalog queries. We present an algorithm that, given an arbitrary datalog program, constructs an EDB-bounded datalog program in which every rule is adorned with a non-recursive conjunctive query that subsumes the result of the rule and thus acts as an upper bound on it. From these adornments, we define a notion of width based on integral or fractional edge-cover widths. Together, the adornments and the width measure yield, for every IDB predicate, a worst-case upper bound on its size that is polynomial in the input data size for a fixed program structure. From these size bounds, we also derive fixed-parameter tractable, output-sensitive asymptotic complexity bounds for evaluating the entire program. Finally, by adapting our framework, we obtain a semi-decision procedure for datalog boundedness that efficiently rewrites most practical bounded programs into equivalent non-recursive programs.
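To make the idea of an adornment concrete, the following is a minimal sketch on the textbook transitive-closure program. It is not the construction algorithm described above: the adorning query `Q(x, y) :- E(x, _), E(_, y)` is chosen by hand for illustration, and the function names `transitive_closure` and `adornment_bound` are hypothetical. Covering the variables `x` and `y` of `Q` takes two `E` atoms, so the integral edge-cover number is 2 and the size of `T` is at most `|E|^2`.

```python
from itertools import product


def transitive_closure(edges):
    """Naive bottom-up evaluation of the recursive program
         T(x, y) :- E(x, y).
         T(x, y) :- E(x, z), T(z, y).
    """
    t = set(edges)
    while True:
        # One application of the recursive rule: join E with the current T.
        new = {(x, y) for (x, z) in edges for (z2, y) in t if z == z2}
        if new <= t:  # fixpoint reached, nothing new was derived
            return t
        t |= new


def adornment_bound(edges):
    """Evaluate the hand-picked non-recursive conjunctive query
         Q(x, y) :- E(x, _), E(_, y).
    Every derived T(x, y) has x as the source of some edge and y as the
    target of some edge, so Q contains T on every input.
    """
    sources = {x for (x, _) in edges}
    targets = {y for (_, y) in edges}
    return set(product(sources, targets))


# Small example graph: 1 -> 2 -> 3 -> 4 -> 2
edges = {(1, 2), (2, 3), (3, 4), (4, 2)}
t = transitive_closure(edges)
q = adornment_bound(edges)

print(t <= q)                      # the adornment contains the IDB relation
print(len(t), len(q), len(edges) ** 2)
```

On this input the recursive relation `T` is contained in the result of `Q`, and both stay within the `|E|^2` bound given by the two-atom edge cover. The bound is polynomial in the input size even though the program itself is recursive, which is the kind of guarantee the adornments are meant to provide.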