We study the fine-grained complexity of conjunctive queries with grouping and aggregation. For some common aggregate functions (e.g., min, max, count, sum), such a query can be phrased as an ordinary conjunctive query over a database annotated with a suitable commutative semiring. Specifically, we investigate the ability to evaluate such queries by constructing, in log-linear time, a data structure that provides logarithmic-time direct access to the answers ordered by a given lexicographic order. This task is nontrivial since the number of answers might be larger than log-linear in the size of the input, so the data structure needs to provide a compact representation of the space of answers. In the absence of aggregation and annotation, past research provides a sufficient tractability condition on queries and orders. For queries without self-joins, this condition is not just sufficient but also necessary (under conventional lower-bound assumptions in fine-grained complexity). We show that all past results continue to hold for annotated databases, assuming that the annotation itself is not part of the lexicographic order. On the other hand, we show infeasibility for the case of count-distinct, which does not have any efficient representation as a commutative semiring. We then investigate the ability to include the aggregate and annotation outcome in the lexicographic order. Amid the hardness results, one case stands out as tractable: a semiring with an idempotent addition, such as those of min and max. Notably, this case also captures count-distinct over a logarithmic-size domain.
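To make the semiring phrasing concrete, the following is a minimal sketch, not the construction studied in this work. It evaluates one hypothetical query, Q(x) :- R(x, y), S(y, z), grouped by x, over relations whose tuples carry semiring annotations. The names `Semiring`, `COUNTING`, `MIN_PLUS`, and `evaluate`, as well as the sample data, are assumptions introduced for illustration. With the counting semiring and all annotations set to 1, the annotation of an answer is the count of matching (y, z) pairs; with the min-plus semiring, where S-tuples are annotated by their z value and R-tuples by the multiplicative identity 0, it is the minimum z.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring given by its identities and two operations."""
    zero: Any                       # identity of add
    one: Any                        # identity of mul
    add: Callable[[Any, Any], Any]  # combines alternative derivations
    mul: Callable[[Any, Any], Any]  # combines joined tuples


# (N, +, *): with every tuple annotated 1, answers are annotated by a count.
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# (min, +): addition is min, which is idempotent; multiplication is +.
MIN_PLUS = Semiring(float("inf"), 0, min, lambda a, b: a + b)


def evaluate(R: Dict[Tuple, Any], S: Dict[Tuple, Any], sr: Semiring) -> Dict[Any, Any]:
    """Annotated answers of the hypothetical query Q(x) :- R(x, y), S(y, z).

    R and S map tuples to annotations. The annotation of an answer x is the
    semiring sum, over all matching (y, z), of the product of annotations.
    """
    # Project z out of S first; distributivity makes this step sound.
    s_by_y: Dict[Any, Any] = {}
    for (y, _z), ann in S.items():
        s_by_y[y] = sr.add(s_by_y.get(y, sr.zero), ann)

    answers: Dict[Any, Any] = {}
    for (x, y), ann in R.items():
        if y in s_by_y:
            term = sr.mul(ann, s_by_y[y])
            answers[x] = sr.add(answers.get(x, sr.zero), term)
    return answers


# Illustrative data (made up for this sketch).
r_tuples = [("a", 1), ("a", 2), ("b", 1)]
s_tuples = [(1, 10), (1, 20), (2, 5), (3, 1)]

# COUNT(*) grouped by x: annotate every tuple with 1.
count_by_x = evaluate({t: 1 for t in r_tuples}, {t: 1 for t in s_tuples}, COUNTING)

# MIN(z) grouped by x: annotate S-tuples with z, R-tuples with the identity 0.
min_z_by_x = evaluate({t: 0 for t in r_tuples}, {t: t[1] for t in s_tuples}, MIN_PLUS)

# Answers listed in lexicographic order of x, each with its annotation.
for x in sorted(count_by_x):
    print(x, count_by_x[x], min_z_by_x[x])
```

The same `evaluate` function serves both aggregates because it relies only on the semiring laws. This sketch materializes all answers, which is affordable here since Q(x) has at most linearly many of them; it does not build the compact direct-access structure needed when the answer set is larger than log-linear in the input.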