The article studies query evaluation in parallel constant time in the CRCW PRAM model. While it is well known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW PRAM model, this article is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalently, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The article discusses some obstacles to constant-time PRAM query evaluation. It presents algorithms for relational operators and explores three settings in which efficient sequential query evaluation algorithms exist: acyclic queries, semijoin algebra queries, and join queries, the latter in the worst-case optimal framework. Under mild assumptions (that data values are numbers of polynomial size in the size of the database, or that the relations of the database are suitably sorted), constant-time algorithms are presented that are weakly work-efficient in the sense that work $\mathcal{O}(T^{1+\varepsilon})$ can be achieved, for every $\varepsilon>0$, where $T$ is the time of an optimal sequential algorithm. Important tools are the algorithms for approximate prefix sums and compaction from Goldberg and Zwick (1995).
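To make the role of prefix sums and compaction concrete, the following is a minimal sequential sketch of how a scattered result array can be compacted once prefix sums are available. It is an illustration only: it uses exact prefix sums and a sequential loop, and does not attempt to reproduce the constant-time approximate algorithms of Goldberg and Zwick. The function name `compact` and the use of `None` to mark empty cells are assumptions made for this example.

```python
from itertools import accumulate


def compact(cells):
    """Move the non-empty (non-None) cells to the front, preserving order.

    Hypothetical helper for illustration; not the article's algorithm.
    """
    # flags[i] is 1 if cell i holds an item and 0 if it is empty.
    flags = [0 if c is None else 1 for c in cells]
    # Inclusive prefix sums: sums[i] = number of items in cells[0..i].
    sums = list(accumulate(flags))
    total = sums[-1] if sums else 0
    out = [None] * total
    # In a PRAM setting, each loop iteration would correspond to one
    # processor. Every occupied cell writes to a distinct target index,
    # so the writes do not conflict.
    for i, c in enumerate(cells):
        if flags[i]:
            out[sums[i] - 1] = c
    return out


if __name__ == "__main__":
    print(compact([None, "a", None, "b", "c", None]))
```

Once the prefix sums are known, each occupied cell can determine its target position independently of the others, which is why prefix sums are the natural primitive behind compaction.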