Identifying Minimal Changes in the Zone Abstract Domain

Verification techniques express program states as logical formulas over program variables. For example, symbolic execution and abstract interpretation encode program states as sets of integer inequalities. For real-world programs, however, these formulas tend to become large, which limits the scalability of the analyses. To address this problem, researchers have developed complementary approaches that either remove redundant inequalities or extract a subset of inequalities sufficient for a specific reasoning task. For arbitrary integer inequalities, such reduction approaches either have high complexity or over-approximate. Their efficiency and precision can be improved, however, for the restricted type of logical formulas used in relational numerical abstract domains. While previous work investigated custom, efficient elimination of redundant inequalities for Zones states, our work examines custom semantic slicing algorithms that identify a minimal set of changed inequalities in Zones states. The client application of minimal changes in Zones is an empirical study comparing invariants computed by data-flow analysis using the Zones, Intervals, and Predicates numerical domains. In particular, the evaluation measures how our proposed algorithms affect the precision of comparing the Zones vs. Intervals and Zones vs. Predicates abstract domains. The results show that, compared to full states, our techniques reduce the number of variables by more than 70% and the number of inequalities by 30%. The approach refines the granularity of comparison between domains, reducing the share of incomparable invariants between Zones and Predicates from 52% to 4% and increasing the share of equal invariants between Intervals and Zones from 27% to 71%. The techniques also improve comparison efficiency, reducing the total runtime of all subject comparisons for Zones and Predicates from over 4 minutes to a few seconds.
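
To make the notion of a changed inequality concrete, the following is a minimal sketch, not the algorithms proposed in this work. It assumes a Zones state is a set of constraints of the form x - y <= c, with a hypothetical special variable "0" fixed at zero so that x - 0 <= c encodes x <= c. The names `close` and `entrywise_diff` are illustrative. The sketch closes two states with Floyd-Warshall and reports every bound that differs, which is a naive baseline: it also lists derived bounds, so its output is generally larger than a minimal set of changes.

```python
from itertools import product

INF = float("inf")
ZERO = "0"  # hypothetical special variable fixed at 0


def close(variables, constraints):
    """Shortest-path closure of constraints {(x, y): c}, each meaning x - y <= c."""
    vs = [ZERO] + list(variables)
    d = {(x, y): (0 if x == y else INF) for x, y in product(vs, vs)}
    for (x, y), c in constraints.items():
        d[(x, y)] = min(d[(x, y)], c)
    # Floyd-Warshall: product() varies its first argument slowest, so k is outermost.
    for k, i, j in product(vs, vs, vs):
        d[(i, j)] = min(d[(i, j)], d[(i, k)] + d[(k, j)])
    return d


def entrywise_diff(before, after):
    """Naive diff: every closed bound whose value differs between the two states."""
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}


variables = ["x", "y", "z"]  # z is unconstrained in both states

# State before: 0 <= x <= 5, 0 <= y, y <= x
before = close(variables, {("x", ZERO): 5, (ZERO, "x"): 0, ("y", "x"): 0, (ZERO, "y"): 0})
# State after: same, except the upper bound on x is tightened to 3
after = close(variables, {("x", ZERO): 3, (ZERO, "x"): 0, ("y", "x"): 0, (ZERO, "y"): 0})

delta = entrywise_diff(before, after)
touched = {v for pair in delta for v in pair if v != ZERO}

for (x, y), (old, new) in delta.items():
    print(f"{x} - {y} <= {old}  became  {x} - {y} <= {new}")
print(sorted(touched))
```

In this example only the bound on x was edited, yet the naive diff also reports the derived bounds on y and on x - y. The unconstrained variable z does not appear at all, which illustrates in miniature how restricting attention to changed inequalities can shrink the set of variables a comparison has to consider.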
