$O(1)$-Round MPC Algorithms for Multi-dimensional Grid Graph Connectivity, EMST and DBSCAN

In this paper, we investigate three fundamental problems in the Massively Parallel Computation (MPC) model: (i) grid graph connectivity, (ii) approximate Euclidean Minimum Spanning Tree (EMST), and (iii) approximate DBSCAN. Our first result is an $O(1)$-round Las Vegas (i.e., succeeding with high probability) MPC algorithm for computing the connected components of a $d$-dimensional $c$-penetration grid graph ($(d,c)$-grid graph), where both $d$ and $c$ are positive integer constants. In such a grid graph, each vertex is a point with integer coordinates in $\mathbb{N}^d$, and an edge can exist only between two distinct vertices whose $\ell_\infty$-distance is at most $c$. To our knowledge, the best existing approach for computing the connected components (CC's) of $(d,c)$-grid graphs in the MPC model is to run the state-of-the-art MPC CC algorithms designed for general graphs: they take $O(\log \log n + \log D)$ [FOCS19] and $O(\log \log n + \log \frac{1}{\lambda})$ [PODC19] rounds, respectively, where $D$ is the {\em diameter} and $\lambda$ is the {\em spectral gap} of the graph. Building on our grid graph connectivity technique, our second main result is an $O(1)$-round Las Vegas MPC algorithm for computing an approximate Euclidean MST. The existing state-of-the-art result on this problem is the $O(1)$-round MPC algorithm proposed by Andoni et al. [STOC14], which guarantees an approximation of the overall weight only in expectation. In contrast, our algorithm guarantees not only a deterministic approximation of the overall weight but also a deterministic edge-wise weight approximation. The latter property is crucial to many applications, such as finding the Bichromatic Closest Pair and DBSCAN clustering. Last but not least, our third main result is an $O(1)$-round Las Vegas MPC algorithm for computing an approximate DBSCAN clustering in $O(1)$-dimensional space.
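
To make the definition of a $(d,c)$-grid graph concrete, the following is a minimal sequential sketch in Python. It checks the edge condition (two distinct integer points at $\ell_\infty$-distance at most $c$) and computes connected components with a plain union-find. It is not the $O(1)$-round MPC algorithm of this paper, and the function names `linf_distance`, `is_valid_grid_edge` and `connected_components` are hypothetical, chosen only for illustration.

```python
def linf_distance(u, v):
    """l_infinity distance between two integer points of equal dimension."""
    return max(abs(a - b) for a, b in zip(u, v))


def is_valid_grid_edge(u, v, c):
    """An edge may exist only between distinct vertices within l_inf distance c."""
    return u != v and linf_distance(u, v) <= c


def connected_components(vertices, edges, c):
    """Sequential union-find baseline (illustrative only, not the MPC algorithm)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Path halving: point each visited node at its grandparent.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if not is_valid_grid_edge(u, v, c):
            raise ValueError(f"edge {u}-{v} violates the (d,c)-grid condition")
        parent[find(u)] = find(v)

    # Group vertices by the representative of their component.
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


# A small (2,2)-grid graph: three vertices form one component, one is isolated.
example_vertices = [(0, 0), (1, 1), (3, 1), (7, 7)]
example_edges = [((0, 0), (1, 1)), ((1, 1), (3, 1))]
example_components = connected_components(example_vertices, example_edges, c=2)
```

In this example the edge between `(1, 1)` and `(3, 1)` is admissible because the coordinates differ by at most 2, whereas an edge between `(0, 0)` and `(3, 1)` would be rejected for $c = 2$.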