Work-Efficient Parallel Derandomization II: Optimal Concentrations via Bootstrapping

We present an efficient parallel derandomization method for randomized algorithms that rely on concentration bounds such as the Chernoff bound. This settles a classic problem in parallel derandomization that dates back to the 1980s. Consider the \textit{set balancing} problem: given $m$ sets $S_1, \dots, S_m$, each of size at most $s$, over a ground set of size $n$, partition the ground set into two parts such that each set is split evenly up to a small additive (discrepancy) bound. By the Chernoff bound, a random partition achieves a discrepancy of $O(\sqrt{s \log m})$ in each set. We give a deterministic parallel algorithm that matches this bound, using near-linear work and polylogarithmic depth. Previous results were weaker in their discrepancy bounds, their work bounds, or both. Motwani, Naor, and Naor [FOCS'89] and Berger and Rompel [FOCS'89] achieve discrepancy $s^{\varepsilon} \cdot O(\sqrt{s \log m})$ with work $\tilde{O}(m+n+\sum_{i=1}^{m} |S_i|) \cdot m^{\Theta(1/\varepsilon)}$ and polylogarithmic depth. Later work, e.g. by Harris [Algorithmica'19], improved the discrepancy to $O(\sqrt{s \log m})$, but the work bound remained high at $\tilde{O}(m^4n^3)$. Ghaffari, Grunau, and Rozhon [FOCS'23] achieve discrepancy $s/\mathrm{poly}(\log(nm)) + O(\sqrt{s \log m})$ with near-linear work and polylogarithmic depth. Notice that this discrepancy improves on the trivial bound of $s$ by only a polylogarithmic factor. Our method relies on a novel bootstrapping idea that uses crude partitioning algorithms as a subroutine. In particular, we solve the problem recursively: in each iteration, we use the crude partition to split the variables into many smaller parts, and then we find a constraint on the variables in each part that reduces the overall number of variables in the problem. The scheme relies on an interesting application of the multiplicative weights update method to control the variance losses in each iteration.
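To make the randomized baseline concrete, the following is a minimal Python sketch of set balancing by a uniformly random partition. It is not the deterministic parallel algorithm of this paper; it only illustrates the $O(\sqrt{s \log m})$ target that the algorithm matches. The function names, the retry loop, and the explicit constant in `chernoff_target` are assumptions made for illustration: by Hoeffding's inequality and a union bound over the $m$ sets, a uniform $\pm 1$ coloring exceeds $\sqrt{2 s \ln(4m)}$ on some set with probability at most $1/2$, so a few independent trials should suffice.

```python
import math
import random


def discrepancy(sets, coloring):
    """Largest |#(+1) - #(-1)| over all sets under a +/-1 coloring."""
    return max((abs(sum(coloring[v] for v in S)) for S in sets), default=0)


def chernoff_target(s, m):
    """Illustrative threshold sqrt(2 s ln(4m)) (constant chosen for this sketch).

    Hoeffding gives P(|sum| > t) <= 2 exp(-t^2 / (2s)) for a set of size <= s,
    which is 1/(2m) at this t; a union bound over m sets gives failure <= 1/2.
    """
    return math.sqrt(2 * s * math.log(4 * m))


def random_balanced_coloring(n, sets, rng, max_trials=200):
    """Sample uniform +/-1 colorings until every set meets the target."""
    s = max((len(S) for S in sets), default=0)
    target = chernoff_target(s, max(len(sets), 1))
    for _ in range(max_trials):
        coloring = [rng.choice((-1, 1)) for _ in range(n)]
        if discrepancy(sets, coloring) <= target:
            return coloring, target
    raise RuntimeError("no coloring met the target; this is extremely unlikely")


# Hypothetical instance: m = 50 sets of size s = 40 over n = 200 elements.
rng = random.Random(0)
n, m, s = 200, 50, 40
sets = [rng.sample(range(n), s) for _ in range(m)]
coloring, target = random_balanced_coloring(n, sets, rng)
print(f"discrepancy {discrepancy(sets, coloring)} <= target {target:.1f} (trivial bound {s})")
```

On this instance the target is about 20.6, well below the trivial bound of $s = 40$. The sketch uses $\Theta(n)$ independent random bits per trial, which is exactly the resource a derandomization must eliminate.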
