Probabilistic recurrence relations (PRRs) are a standard formalism for describing the runtime of a randomized algorithm. Given a PRR and a time limit $\kappa$, we consider the classical notion of tail probability $\Pr[T \ge \kappa]$, i.e., the probability that the randomized runtime $T$ of the PRR reaches or exceeds the time limit $\kappa$. Our focus is the formal analysis of tail bounds, which aims to find an upper bound $u \geq \Pr[T\ge\kappa]$ that is asymptotically tight in the time limit $\kappa$. The classical and best-known approach to this problem is the cookbook method by Karp (JACM 1994), while other approaches are mostly limited to deriving tail bounds for specific PRRs via involved custom analysis. In this work, we propose a novel approach for deriving exponentially-decreasing tail bounds (a common type of tail bound) for PRRs whose preprocessing time and random passed sizes follow discrete or (piecewise) uniform distributions and whose recursive call is either a single procedure call or a divide-and-conquer. We first establish a theoretical approach via Markov's inequality, and then instantiate it with a template-based algorithmic approach via a refined treatment of exponentiation. Experimental evaluation shows that our algorithmic approach derives tail bounds that (i) are asymptotically tighter than those obtained by Karp's method, (ii) match the best-known manually-derived asymptotic tail bound for QuickSelect, and (iii) are only slightly worse (by a $\log\log n$ factor) than the manually-proven optimal asymptotic tail bound for QuickSort. Moreover, our algorithmic approach handles all examples (including realistic PRRs such as QuickSort, QuickSelect, DiameterComputation, etc.) in less than 0.1 seconds, showing that it is efficient in practice.
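To illustrate how Markov's inequality can yield an exponentially-decreasing tail bound, the following is the standard textbook argument, given here only as a sketch. The parameter $t > 0$ and the constant $C$ are assumptions introduced for illustration; the exact formulation and templates used in the paper may differ.

```latex
% Sketch: exponential tail bound from Markov's inequality.
% Assumptions (not from the abstract): a fixed t > 0 and a constant C
% with E[e^{tT}] <= C.
\begin{align*}
\Pr[T \ge \kappa]
  &= \Pr\!\left[e^{tT} \ge e^{t\kappa}\right]
     && \text{since } x \mapsto e^{tx} \text{ is strictly increasing for } t > 0 \\
  &\le \frac{\mathbb{E}\!\left[e^{tT}\right]}{e^{t\kappa}}
     && \text{by Markov's inequality, as } e^{tT} \ge 0 \\
  &\le C \cdot e^{-t\kappa}
     && \text{if } \mathbb{E}\!\left[e^{tT}\right] \le C .
\end{align*}
```

Under these assumptions, the task of bounding $\Pr[T \ge \kappa]$ reduces to bounding the expectation $\mathbb{E}[e^{tT}]$, and the resulting bound decays exponentially in $\kappa$.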