We present an algorithm for building left-balanced, complete k-d trees over k-dimensional points in a trivially parallel, GPU-friendly way. The algorithm requires exactly one int per data point as temporary storage and runs in O(log N) iterations, each of which performs one parallel sort and one trivially parallel CUDA per-node update kernel.
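As an illustration, the following is a minimal sequential Python sketch of one way a sort-then-update scheme of this kind might look; it is not the CUDA implementation. It assumes the tree is stored in level order (children of node i at 2i+1 and 2i+2) and that the one int per point is a tag naming the subtree the point currently belongs to. The function names, the tag encoding, and the binary search used to locate a subtree's segment are all assumptions made for this example.

```python
from bisect import bisect_left


def subtree_size(node, n):
    """Nodes in the subtree rooted at `node` of a left-balanced,
    level-order binary tree that has `n` nodes in total."""
    size, first, width = 0, node, 1
    while first < n:
        size += min(width, n - first)  # this subtree's nodes on the current level
        first = 2 * first + 1          # leftmost descendant one level further down
        width *= 2
    return size


def build_left_balanced_kd_tree(points):
    """Return `points` reordered into level order (hypothetical helper)."""
    n = len(points)
    if n == 0:
        return []
    k = len(points[0])
    pts = list(points)
    tags = [0] * n                 # assumed meaning: id of the subtree a point is in
    num_levels = n.bit_length()    # levels in a left-balanced tree with n nodes

    # After the second-to-last level is processed, every tag is unique.
    for level in range(num_levels - 1):
        dim = level % k
        # Step 1: one sort by (tag, coordinate along this level's split dimension).
        order = sorted(range(n), key=lambda i: (tags[i], pts[i][dim]))
        pts = [pts[i] for i in order]
        tags = [tags[i] for i in order]

        first_on_level = (1 << level) - 1

        # Step 2: per-element update; each call only reads the sorted tags,
        # so all calls are independent of one another.
        def new_tag(i):
            s = tags[i]
            if s < first_on_level:
                return s                       # node was fixed on an earlier level
            begin = bisect_left(tags, s)       # start of subtree s in the sorted array
            pivot = begin + subtree_size(2 * s + 1, n)
            if i < pivot:
                return 2 * s + 1               # point moves to the left child subtree
            if i > pivot:
                return 2 * s + 2               # point moves to the right child subtree
            return s                           # point becomes node s itself

        tags = [new_tag(i) for i in range(n)]

    # Final sort by tag places every point at its node index.
    order = sorted(range(n), key=tags.__getitem__)
    return [pts[i] for i in order]


if __name__ == "__main__":
    sample = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
    print(build_left_balanced_kd_tree(sample))
```

In this sketch the `tags` list is the only extra per-point storage, and the loop body consists of exactly one sort followed by one update in which every element is handled independently, which is the property that would let the two steps map onto a parallel sort and a per-element kernel.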