Variable independence and decomposability are algorithmic techniques for simplifying logical formulas by tearing apart connections between free variables. These techniques were originally proposed to speed up query evaluation in constraint databases, in particular by representing the query as a Boolean combination of formulas with no interconnected variables. They also have many other applications in SMT, string analysis, databases, automata theory and other areas. However, the precise complexity of variable independence and decomposability has been left open especially for the quantifier-free theory of linear real arithmetic (LRA), which is central in database applications. We introduce a novel characterization of formulas admitting decompositions and use it to show that it is coNP-complete to decide variable decomposability over LRA. As a corollary, we obtain that deciding variable independence is in $ \Sigma_2^p $. These results substantially improve the best known double-exponential time algorithms for variable decomposability and independence. In many practical applications, it is crucial to be able to efficiently eliminate connections between variables whenever possible. We design and implement an algorithm for this problem, which is optimal in theory, exponentially faster compared to the current state-of-the-art algorithm and efficient on various microbenchmarks. In particular, our algorithm is the first one to overcome a fundamental barrier between non-discrete and discrete first-order theories. Formulas arising in practice often have few or even no free variables that are perfectly independent. In this case, our algorithm can compute a best-possible approximation of a decomposition, which can be used to optimize database queries by exploiting partial variable independence, which is present in almost every logical formula or database query constraint.
翻译:变量独立性与可分解性是算法技术,通过切断自由变量之间的关联来简化逻辑公式。这些技术最初旨在加速约束数据库中的查询评估,特别是通过将查询表示为无相互关联变量的公式的布尔组合。它们在SMT、字符串分析、数据库、自动机理论及其他领域也有众多应用。然而,变量独立性与可分解性的精确复杂度问题至今尚未解决,尤其是在数据库应用中核心的无限定词线性实数算术(LRA)理论中。我们提出了一种新的公式特征刻画方法,用以识别可分解公式,并据此证明在LRA中判定变量可分解性是coNP完全的。作为推论,我们得到判定变量独立性的复杂度属于$\Sigma_2^p$。这些结果显著改进了此前关于变量可分解性与独立性已知的最优双指数时间算法。在许多实际应用中,高效消除变量间可能存在的关联至关重要。我们设计并实现了一种针对该问题的算法,该算法在理论上达到最优,相比于当前最先进算法呈指数级加速,并在多种微基准测试中表现高效。特别地,该算法首次突破了非离散与离散一阶理论之间的根本性障碍。实际出现的公式通常具有少量甚至无完全独立的自由变量。针对这种情况,我们的算法可计算最优近似分解,通过利用几乎每个逻辑公式或数据库查询约束中存在的部分变量独立性来优化数据库查询。