Tight Bounds for Repeated Balls-into-Bins

We study the repeated balls-into-bins process introduced by Becchetti, Clementi, Natale, Pasquale and Posta (2019). This process starts with $m$ balls arbitrarily distributed across $n$ bins. At each round $t=1,2,\ldots$, one ball is selected from each non-empty bin and placed into a bin chosen independently and uniformly at random. We prove the following results: $\quad \bullet$ For any $n \leq m \leq \mathrm{poly}(n)$, we prove a lower bound of $\Omega(m/n \cdot \log n)$ on the maximum load. For the special case $m=n$, this matches the upper bound of $O(\log n)$ shown in [BCNPP19]. It also gives a positive answer to the conjecture in [BCNPP19] that for $m=n$ the maximum load is $\omega(\log n/ \log \log n)$ at least once in a polynomially large time interval. For $m\in [\omega(n),n\log n]$, our new lower bound disproves the conjecture in [BCNPP19] that the maximum load remains $O(\log n)$. $\quad \bullet$ For any $n\leq m\leq\mathrm{poly}(n)$, we prove an upper bound of $O(m/n\cdot\log n)$ on the maximum load that holds for all steps of a polynomially large time interval. This matches our lower bound up to multiplicative constants. $\quad \bullet$ For any $m\geq n$, our analysis also implies an $O(m^2/n)$ waiting time to reach a configuration with maximum load $O(m/n\cdot\log m)$, even for worst-case initial distributions. $\quad \bullet$ For any $m \geq n$, we show that every ball visits every bin in $O(m\log m)$ rounds. For $m = n$, this improves the previous upper bound of $O(n \log^2 n)$ in [BCNPP19]. We also prove that this upper bound is tight up to multiplicative constants for any $n \leq m \leq \mathrm{poly}(n)$.
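
The process described above can be simulated in a few lines. The following is a minimal sketch for illustration, not the authors' code: the function names `rbb_round` and `simulate`, the choice to start with all balls in bin 0, and the parameter values are assumptions made here. Since balls are indistinguishable with respect to load, the sketch tracks only bin loads, so which ball leaves a bin does not matter.

```python
import math
import random


def rbb_round(loads, rng):
    """Run one round in place: each non-empty bin releases one ball,
    and each released ball lands in an independent uniform random bin."""
    n = len(loads)
    # Decide which bins release a ball before any ball is re-thrown.
    nonempty = [i for i, load in enumerate(loads) if load > 0]
    for i in nonempty:
        loads[i] -= 1
    for _ in nonempty:
        loads[rng.randrange(n)] += 1


def simulate(n, m, rounds, seed=0):
    """Run the process for `rounds` rounds.

    Assumption for illustration: all m balls start in bin 0, one possible
    unbalanced initial distribution. Returns the final loads and the largest
    maximum load observed after any round.
    """
    rng = random.Random(seed)
    loads = [0] * n
    loads[0] = m
    peak = 0
    for _ in range(rounds):
        rbb_round(loads, rng)
        peak = max(peak, max(loads))
    return loads, peak


if __name__ == "__main__":
    n, m = 200, 400
    loads, peak = simulate(n, m, rounds=20000, seed=1)
    scale = (m / n) * math.log(n)  # the (m/n) log n scale from the bounds
    print("final max load:", max(loads))
    print("peak max load :", peak)
    print("(m/n) * ln n  :", round(scale, 2))
```

Because the run starts with all balls in one bin, the recorded peak is dominated by the early rounds; to compare against the $(m/n)\cdot\log n$ scale one would measure the maximum load only after a warm-up period, whose length is left here as a free choice.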
