Tight Fine-Grained Bounds for Direct Access on Join Queries

We consider the task of lexicographic direct access to query answers. That is, we want to simulate an array containing the answers of a join query sorted in a lexicographic order chosen by the user. A recent dichotomy showed for which queries and orders this task can be done in polylogarithmic access time after quasilinear preprocessing, but this dichotomy does not tell us how much time is required in the cases classified as hard. We determine the preprocessing time needed to achieve polylogarithmic access time for all join queries and all lexicographic orders. To this end, we propose a decomposition-based general algorithm for direct access on join queries. We then explore its optimality by proving lower bounds for the preprocessing time based on the hardness of a certain online Set-Disjointness problem, which shows that our algorithm's bounds are tight for all lexicographic orders on join queries. Then, we prove the hardness of Set-Disjointness based on the Zero-Clique Conjecture, an established conjecture from fine-grained complexity theory. Interestingly, while proving our lower bound, we show that self-joins do not affect the complexity of direct access (up to logarithmic factors). Our algorithm can also be used to solve queries with projections and relaxed order requirements, though in these cases its running time is not necessarily optimal. We also show that techniques similar to those used in our lower bounds can be used to prove that, for enumerating answers to Loomis-Whitney joins, it is not possible to significantly improve upon trivially computing all answers at preprocessing. This, in turn, gives further evidence (based on the Zero-Clique Conjecture) for the enumeration hardness of self-join-free cyclic joins with respect to linear preprocessing and constant delay.
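To make the task concrete, the following minimal Python sketch illustrates lexicographic direct access for one simple query only: the two-relation join Q(x, y, z) :- R(x, y), S(y, z) under the order (x, y, z). It is not the decomposition-based algorithm proposed in this work. The function names `preprocess` and `access` and the choice of query are assumptions made purely for illustration. After sorting-based preprocessing, the k-th answer is found with one binary search over prefix counts, without materializing the join.

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import accumulate


def preprocess(R, S):
    """Build an index for Q(x, y, z) :- R(x, y), S(y, z), ordered by (x, y, z).

    Hypothetical helper: sorts both relations and stores, for each R-tuple
    that joins with S, the cumulative number of answers up to that tuple.
    """
    # For every y, the sorted list of z values with (y, z) in S.
    z_by_y = defaultdict(list)
    for y, z in sorted(set(S)):
        z_by_y[y].append(z)
    # R-tuples in lexicographic order, keeping only those with a join partner.
    rows = [(x, y) for x, y in sorted(set(R)) if y in z_by_y]
    # prefix[i] = number of answers produced by rows[0..i].
    prefix = list(accumulate(len(z_by_y[y]) for _, y in rows))
    return rows, prefix, z_by_y


def access(index, k):
    """Return the k-th answer (0-based) in lexicographic (x, y, z) order."""
    rows, prefix, z_by_y = index
    total = prefix[-1] if prefix else 0
    if not 0 <= k < total:
        raise IndexError(k)
    # First row whose cumulative count exceeds k.
    i = bisect_right(prefix, k)
    x, y = rows[i]
    offset = k - (prefix[i - 1] if i else 0)
    return (x, y, z_by_y[y][offset])


if __name__ == "__main__":
    R = [(1, 7), (1, 8), (2, 7)]
    S = [(7, 10), (7, 20), (8, 5)]
    index = preprocess(R, S)
    print(access(index, 2))
```

In this sketch, preprocessing is dominated by sorting and each access costs one binary search, which matches the "quasilinear preprocessing, polylogarithmic access" regime for this particular query and order. For the queries and orders classified as hard, no such simple index is expected to suffice, which is the regime the bounds in this work address.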
