Prefix aggregation (also called scan) and its special case, prefix summation, are important parallel primitives that have received considerable attention in the research literature and are used as a step in many algorithms. Aggregation over dominated points in $\mathbb{R}^m$ is a multidimensional generalisation of prefix aggregation. It is also intensively researched, both as a parallel primitive and as a practical problem encountered in computational geometry, spatial databases and data warehouses. In this paper we show that, for a constant dimension $m$, aggregation over dominated points in $\mathbb{R}^m$ can be computed by $O(1)$ basic operations that include sorting the whole dataset, zipping sorted lists of elements, computing prefix aggregations of lists of elements, and flat maps, which expand the data size from the initial $n$ to $n\log^{m-1}n$. We thereby establish that prefix aggregation suffices to express aggregation over dominated points in higher dimensions, even though the latter is a far-reaching generalisation of the former. As a consequence, many problems known to be expressible by aggregation over dominated points become expressible by prefix aggregation as well. We rely on a small set of primitive operations that guarantee easy transfer to various distributed architectures and some desired properties of the implementation.
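To make the two notions concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes weak dominance (a point $q$ is dominated by $p$ when $q_i \le p_i$ in every coordinate, so each point dominates itself) and uses summation as the aggregation; both choices are assumptions made for illustration. The function names `prefix_aggregate`, `dominated_aggregate_bruteforce` and `dominated_aggregate_1d` are hypothetical. The sketch shows a plain scan, a quadratic reference definition for any dimension, and the one-dimensional case, where sorting followed by a scan already suffices.

```python
from bisect import bisect_right
from itertools import accumulate
from operator import add


def prefix_aggregate(values, op=add):
    """Inclusive scan: out[i] = values[0] op ... op values[i]."""
    return list(accumulate(values, op))


def dominated_aggregate_bruteforce(points, weights):
    """O(n^2) reference definition for any dimension m.

    For each point p, sum the weights of every point q with
    q[i] <= p[i] in all coordinates (weak dominance, assumed here).
    """
    return [
        sum(
            w
            for q, w in zip(points, weights)
            if all(qi <= pi for qi, pi in zip(q, p))
        )
        for p in points
    ]


def dominated_aggregate_1d(xs, weights):
    """One-dimensional case: a sort followed by a prefix sum.

    Ties are handled by looking up the position just past the last
    key equal to x in the sorted key list.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    keys = [xs[i] for i in order]
    # prefix[k] holds the total weight of the k smallest keys.
    prefix = [0] + prefix_aggregate([weights[i] for i in order])
    return [prefix[bisect_right(keys, x)] for x in xs]
```

In one dimension the sorted order makes every dominated set a prefix, which is why a single scan is enough. In higher dimensions a dominated set is no longer a prefix of any single sorted order; the brute-force function only states what has to be computed there and says nothing about how the paper achieves it with sorts, zips, scans and flat maps.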