This article studies query evaluation in constant parallel time in the CRCW PRAM model. While it is well known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW PRAM, this article is concerned with the efficiency of evaluation algorithms, that is, with the number of processors or, asymptotically equivalently, the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in representations of the result sets that can be extremely scattered in memory. The article discusses some obstacles to constant-time PRAM query evaluation. It presents algorithms for relational operators and explores three settings in which efficient sequential query evaluation algorithms exist: acyclic queries, semijoin algebra queries, and join queries, the latter in the worst-case optimal framework. Under mild assumptions (that data values are numbers of polynomial size in the size of the database, or that the relations of the database are suitably sorted), constant-time algorithms are presented that are weakly work-efficient in the sense that work $\mathcal{O}(T^{1+\varepsilon})$ can be achieved, for every $\varepsilon>0$, where $T$ is the running time of an optimal sequential algorithm. Important tools are the algorithms for approximate prefix sums and compaction by Goldberg and Zwick (1995).
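To make the role of compaction concrete, the following is a minimal sequential sketch of how prefix sums can gather a scattered result set into a contiguous array. It is an illustration only: it uses exact prefix sums and a plain loop, not the constant-time approximate algorithms of Goldberg and Zwick, and the function name `compact` and the use of `None` for empty memory cells are assumptions made for this example.

```python
from itertools import accumulate


def compact(cells):
    """Gather the non-empty entries of `cells` into a dense list, keeping order.

    `None` marks an empty cell (an assumption of this sketch).
    """
    # 1 for an occupied cell, 0 for an empty one.
    flags = [0 if c is None else 1 for c in cells]
    # Inclusive prefix sums: offsets[i] = number of occupied cells in cells[0..i].
    offsets = list(accumulate(flags))
    total = offsets[-1] if offsets else 0
    out = [None] * total
    # Each iteration writes to a distinct target position, so on a PRAM
    # every index i could be handled by its own processor in one step.
    for i, c in enumerate(cells):
        if flags[i]:
            out[offsets[i] - 1] = c
    return out


# A result set whose tuples are scattered among empty cells.
scattered = [None, ("a", 1), None, None, ("b", 2), ("c", 3), None]
dense = compact(scattered)
```

Once the offsets are known, the final scatter step is a single parallel write. The costly part is computing the offsets, which is why fast approximate prefix sums matter in the constant-time setting.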