DBSCAN is a well-known density-based clustering algorithm for discovering clusters of arbitrary shape. While conceptually simple in serial, the algorithm is challenging to parallelize efficiently on manycore GPU architectures. Common pitfalls, such as asynchronous range query calls, result in high thread execution divergence in many implementations. In this paper, we propose a new framework for GPU-accelerated DBSCAN and describe two tree-based algorithms within that framework. Both algorithms fuse the neighbor search with the update of cluster information, but differ in their treatment of dense regions of the data. We show that the time taken to compute the clusters is at most twice the time needed to determine the neighbors. We compare the proposed algorithms with existing CPU and GPU implementations, and demonstrate their competitiveness and performance on low-dimensional data using a fast traversal structure (a bounding volume hierarchy). We also show that memory usage can be reduced by processing the neighbors of each object dynamically, without storing them.
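To make the idea of fusing the neighbor search with the cluster update more concrete, the following is a minimal serial sketch in Python. It is not the implementation described in the paper. It makes several assumptions: a brute-force distance scan stands in for the bounding volume hierarchy traversal, a union-find structure is assumed as the representation of cluster information, and the names `dbscan_fused`, `find`, and `union` are hypothetical. The point it illustrates is that each neighbor is processed as soon as it is found, so no neighbor list is ever stored.

```python
import math


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, i, j):
    # Merge two sets; the smaller root index becomes the representative.
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[max(ri, rj)] = min(ri, rj)


def dbscan_fused(points, eps, min_pts):
    """Serial sketch: returns one label per point, -1 for noise.

    Assumption: brute-force scans replace the tree traversal, and the
    neighbor count includes the point itself.
    """
    n = len(points)

    def near(i, j):
        return math.dist(points[i], points[j]) <= eps

    # Pass 1: find core points, stopping each scan once min_pts is reached.
    is_core = [False] * n
    for i in range(n):
        count = 0
        for j in range(n):
            if near(i, j):
                count += 1
                if count >= min_pts:
                    is_core[i] = True
                    break

    # Pass 2: handle each neighbor pair immediately, without storing it.
    parent = list(range(n))
    claimed = [False] * n  # border points already attached to a cluster
    for i in range(n):
        if not is_core[i]:
            continue
        for j in range(n):
            if j == i or not near(i, j):
                continue
            if is_core[j]:
                union(parent, i, j)  # core-core pair: merge clusters
            elif not claimed[j]:
                claimed[j] = True    # border point: first core point wins
                parent[j] = i

    return [find(parent, i) if (is_core[i] or claimed[i]) else -1
            for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan_fused(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 3, 3, 3, -1]
```

In this sketch each cluster is labeled by the smallest index among its core points, and the isolated point at `(50, 50)` is reported as noise. On a GPU the two loops over `i` would run in parallel, so the union and the border-point claim would need atomic operations; that part is deliberately left out here.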