WDD: Weighted Delta Debugging

Delta debugging is a widely used family of algorithms (e.g., ddmin and ProbDD) that automatically minimize bug-triggering test inputs, thereby facilitating debugging. A delta debugging algorithm takes a list of elements, each representing a fragment of the test input, systematically partitions the list at different granularities, and identifies and deletes bug-irrelevant partitions. Prior delta debugging algorithms assume that the elements in the list do not differ from one another, and thus treat them uniformly during partitioning. In practice, however, this assumption usually does not hold, because the size (referred to as weight) of the fragment represented by each element can vary significantly. For example, a single element representing 50% of the test input is much more likely to be bug-relevant than an element representing only 1%. The uniformity assumption inevitably impairs the efficiency, or even the effectiveness, of these delta debugging algorithms. This paper proposes Weighted Delta Debugging (WDD), a novel concept that helps prior delta debugging algorithms overcome this limitation. The key insight of WDD is to assign each element in the list a weight according to its size, and to distinguish elements by their weights during partitioning. We designed two new minimization algorithms, Wddmin and WProbDD, by applying WDD to ddmin and ProbDD, respectively. We extensively evaluated Wddmin and WProbDD in two representative applications, HDD and Perses, on 62 benchmarks across two languages. The results strongly demonstrate the value of WDD. We firmly believe that WDD opens up a new dimension for improving test input minimization techniques.
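To make the contrast between uniform and weight-aware partitioning concrete, the following is a minimal sketch in Python. It is not the Wddmin or WProbDD algorithm, whose details the abstract does not give. It only illustrates one plausible reading of "distinguishing elements by their weights during partitioning": splitting the list into contiguous chunks of roughly equal total weight rather than equal element count. The function names `partition_by_count` and `partition_by_weight`, the example elements, and the assumption that weights are positive integers (e.g., fragment sizes in tokens) are all hypothetical.

```python
from typing import List, Sequence


def partition_by_count(items: Sequence[str], n: int) -> List[List[str]]:
    """Uniform view: split items into n contiguous chunks of near-equal length."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when n > len(items)
            chunks.append(list(items[start:end]))
        start = end
    return chunks


def partition_by_weight(
    items: Sequence[str], weights: Sequence[int], n: int
) -> List[List[str]]:
    """Weighted view (assumed): split items into at most n contiguous chunks
    whose total weights are as close to equal as a greedy scan allows."""
    total = sum(weights)
    chunks: List[List[str]] = []
    current: List[str] = []
    acc = 0  # running weight of all items seen so far
    for item, w in zip(items, weights):
        current.append(item)
        acc += w
        # Close the chunk once the running weight reaches the next
        # 1/n boundary of the total weight; the last chunk takes the rest.
        if len(chunks) < n - 1 and acc * n >= total * (len(chunks) + 1):
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Hypothetical input: one large fragment and four small ones.
    items = ["big", "a", "b", "c", "d"]
    weights = [60, 10, 10, 10, 10]
    print(partition_by_count(items, 2))
    print(partition_by_weight(items, weights, 2))
```

With these hypothetical weights, the uniform split groups the large fragment together with two small ones, while the weighted split isolates it in its own partition, so a single test can decide whether the 60%-weight fragment is needed.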
