Deep Dive into Probabilistic Delta Debugging: Insights and Simplifications

Given a list L of elements and a property that L exhibits, ddmin is a well-known test input minimization algorithm that automatically eliminates irrelevant elements from L. It is widely used in test input minimization and software debloating. Recently, ProbDD, an advanced variant of ddmin, was proposed and achieved state-of-the-art performance. Employing Bayesian optimization, ProbDD predicts the likelihood that each element of L is essential, and decides statistically which elements, and how many, to remove in each step. Despite its impressive results, the theoretical probabilistic model of ProbDD is complex, and the specific factors driving its superior performance have not been investigated. In this paper, we conduct the first in-depth theoretical analysis of ProbDD, clarifying the trends in how probabilities and subset sizes change while simplifying the probability model. Complementing this analysis, we perform empirical experiments, including a success rate analysis, ablation studies, and an analysis of trade-offs and limitations, to better understand and demystify this state-of-the-art algorithm. Our success rate analysis shows how ProbDD addresses bottlenecks of ddmin by skipping inefficient queries, namely those that attempt to delete the complement of a subset or a previously tried subset. The ablation study reveals that randomness in ProbDD has no significant impact on efficiency. Based on these findings, we propose CDD, a simplified version of ProbDD that reduces complexity in both theory and implementation. Moreover, the performance of CDD validates our key findings. Comprehensive evaluations across 76 benchmarks in test input minimization and software debloating show that CDD achieves the same performance as ProbDD despite its simplification. These insights provide valuable guidance for future research on, and applications of, test input minimization algorithms.
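For readers unfamiliar with the baseline, the following is a minimal Python sketch of classic ddmin as it is commonly described in the delta debugging literature. It is not the ProbDD or CDD algorithm from this paper, and it is not taken from the authors' code. The names `split`, `ddmin`, and `has_property` are illustrative assumptions, and the property test is assumed to be deterministic. The first loop keeps a single chunk, which amounts to deleting that chunk's complement; this appears to be the kind of query the abstract describes as inefficient.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def split(items: List[T], n: int) -> List[List[T]]:
    """Split items into n contiguous, non-empty chunks (requires n <= len(items))."""
    chunks = []
    start = 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        chunks.append(items[start:end])
        start = end
    return chunks


def ddmin(items: List[T], has_property: Callable[[List[T]], bool]) -> List[T]:
    """Illustrative classic ddmin: shrink items while has_property stays true."""
    assert has_property(items), "the full input must exhibit the property"
    n = 2  # current granularity (number of chunks)
    while len(items) >= 2:
        chunks = split(items, n)
        reduced = False

        # Step 1: try keeping a single chunk (i.e. deleting its complement).
        for chunk in chunks:
            if has_property(chunk):
                items, n, reduced = chunk, 2, True
                break

        # Step 2: try deleting a single chunk (i.e. keeping its complement).
        # With n == 2 the complements are the chunks already tried in step 1.
        if not reduced and n > 2:
            for i in range(len(chunks)):
                rest = [x for j, c in enumerate(chunks) if j != i for x in c]
                if has_property(rest):
                    items, n, reduced = rest, max(n - 1, 2), True
                    break

        # Step 3: nothing worked, so refine the granularity or stop.
        if not reduced:
            if n >= len(items):
                break  # already at single-element granularity
            n = min(len(items), 2 * n)
    return items


# Hypothetical property: the input "fails" whenever it contains both 2 and 5.
result = ddmin(list(range(8)), lambda s: {2, 5} <= set(s))
print(result)  # [2, 5]
```

In this toy run every step 1 query fails, and all progress comes from step 2. This is only one hand-picked input, so it illustrates the distinction between the two query types rather than supporting any claim about success rates.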
