Abstract semantics has proven to be instrumental for accelerating search-based program synthesis, by enabling the sound pruning of a set of incorrect programs (without enumerating them). One may expect faster synthesis with increasingly finer-grained abstract semantics. Unfortunately, to the best of our knowledge, this is not the case, yet. The reason is because, as abstraction granularity increases -- while fewer programs are enumerated -- pruning becomes more costly. This imposes a fundamental limit on the overall synthesis performance, which we aim to address in this work. Our key idea is to introduce an offline presynthesis phase, which consists of two steps. Given a DSL with abstract semantics, the first semantics modeling step constructs a tree automaton A for a space of inputs -- such that, for any program P and for any considered input I, A has a run that corresponds to P's execution on I under abstract semantics. Then, the second step builds an oracle O for A. This O enables fast pruning during synthesis, by allowing us to efficiently find exactly those DSL programs that satisfy a given input-output example under abstract semantics. We have implemented this presynthesis-based synthesis paradigm in a framework, Foresighter. On top of it, we have developed three instantiations for SQL, string transformation, and matrix manipulation. All of them significantly outperform prior work in the respective domains.
翻译:抽象语义通过合理剪枝错误程序集(无需枚举),已被证明是加速搜索式程序合成的关键工具。直观而言,采用更细粒度的抽象语义有望实现更快速的合成。然而,据我们所知,目前尚未达到这一预期。原因在于:随着抽象粒度提升,虽然需要枚举的程序减少,但剪枝成本随之增加。这从根本上限制了整体合成性能,而本工作旨在解决该问题。我们的核心思路是引入离线预合成阶段,该阶段包含两个步骤。给定具有抽象语义的领域特定语言,第一步语义建模为输入空间构建树自动机A——使得对于任意程序P和任意输入I,A均存在对应于P在抽象语义下执行过程的运行路径。随后第二步为A构建预言机O。该O通过高效定位满足给定输入输出示例的抽象语义程序,实现了合成过程中的快速剪枝。我们基于该预合成范式实现了框架Foresighter,并在此基础上开发了SQL、字符串转换与矩阵操作三个具体实例。实验表明,这三个实例均显著优于各自领域的既有工作。