We consider the task of lexicographic direct access to query answers. That is, we want to simulate an array containing the answers of a join query sorted in a lexicographic order chosen by the user. A recent dichotomy showed for which queries and orders this task can be done in polylogarithmic access time after quasilinear preprocessing, but this dichotomy does not tell us how much time is required in the cases classified as hard. We determine the preprocessing time needed to achieve polylogarithmic access time for all join queries and all lexicographic orders. To this end, we propose a decomposition-based general algorithm for direct access on join queries. We then explore its optimality by proving lower bounds for the preprocessing time based on the hardness of a certain online Set-Disjointness problem, which shows that our algorithm's bounds are tight for all lexicographic orders on join queries. Then, we prove the hardness of Set-Disjointness based on the Zero-Clique Conjecture, an established conjecture from fine-grained complexity theory. Interestingly, while proving our lower bound, we show that self-joins do not affect the complexity of direct access (up to logarithmic factors). Our algorithm can also be used to solve queries with projections and relaxed order requirements, though in these cases its running time is not necessarily optimal. We also show that techniques similar to those used in our lower bounds can be used to prove that, for enumerating answers to Loomis-Whitney joins, it is not possible to significantly improve upon trivially computing all answers at preprocessing. This, in turn, gives further evidence (based on the Zero-Clique Conjecture) for the enumeration hardness of self-join-free cyclic joins with respect to linear preprocessing and constant delay.
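To make the direct-access task concrete, the following is a minimal sketch for one simple case only: the two-atom join Q(x, y, z) :- R(x, y), S(y, z) under the lexicographic order (x, y, z). It is not the decomposition-based algorithm described above. It only illustrates what "simulating the sorted array of answers" means, using sorting and prefix counts at preprocessing and a binary search at access time. The names `preprocess` and `access`, and the representation of relations as lists of tuples, are assumptions made for this illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def preprocess(R, S):
    """Build an index for Q(x, y, z) :- R(x, y), S(y, z), order (x, y, z).

    R and S are iterables of pairs (hypothetical input format).
    Runs in O(n log n) time for n input tuples.
    """
    # Group the z-values of S by y and sort each group.
    z_by_y = {}
    for y, z in set(S):
        z_by_y.setdefault(y, []).append(z)
    for zs in z_by_y.values():
        zs.sort()
    # Keep only the R-tuples that join with S, sorted lexicographically.
    pairs = sorted(t for t in set(R) if t[1] in z_by_y)
    # prefix[i] = number of answers whose (x, y) part is among pairs[0..i].
    prefix = list(accumulate(len(z_by_y[y]) for _, y in pairs))
    return pairs, prefix, z_by_y


def access(index, k):
    """Return the k-th answer (0-based) in lexicographic order."""
    pairs, prefix, z_by_y = index
    total = prefix[-1] if prefix else 0
    if not 0 <= k < total:
        raise IndexError(k)
    # Binary search for the first (x, y) whose prefix count exceeds k.
    i = bisect_right(prefix, k)
    x, y = pairs[i]
    # Position of the answer within the block of answers sharing (x, y).
    offset = k - (prefix[i - 1] if i else 0)
    return (x, y, z_by_y[y][offset])
```

In this sketch, `access(preprocess(R, S), k)` returns the same tuple as position `k` of the fully materialized and sorted join, without ever building that array. For other queries and orders, such a simple index is not available in general, which is the situation the preprocessing bounds above address.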