Given a value computed within a program, an idempotent backward slice with respect to this value is a maximal subprogram that computes it. An informal notion of an idempotent slice has previously been used by Guimaraes et al. to transform eager into lazy evaluation in the LLVM intermediate representation. However, that algorithm cannot be applied correctly to general control-flow graphs. This paper addresses this limitation by formalizing the notion of idempotent backward slices and presenting a sound and efficient algorithm for extracting them from programs in Gated Static Single Assignment (GSA) form. As an example of their practical use, the paper describes how identifying and extracting idempotent backward slices enables a sparse code-size reduction optimization; that is, one capable of merging non-contiguous sequences of instructions within the control-flow graph of a single function or across functions. Experiments with the LLVM test suite show that, in specific benchmarks, this new algorithm achieves code-size reductions of up to 7.24% on programs already highly optimized by the -Os sequence of passes from clang 17.