Abstract semantics has proven to be instrumental for accelerating search-based program synthesis, by enabling the sound pruning of a set of incorrect programs (without enumerating them). One may expect faster synthesis with increasingly finer-grained abstract semantics. Unfortunately, to the best of our knowledge, this is not the case, yet. The reason is because, as abstraction granularity increases -- while fewer programs are enumerated -- pruning becomes more costly. This imposes a fundamental limit on the overall synthesis performance, which we aim to address in this work. Our key idea is to introduce an offline presynthesis phase, which consists of two steps. Given a DSL with abstract semantics, the first semantics modeling step constructs a tree automaton A for a space of inputs -- such that, for any program P and for any considered input I, A has a run that corresponds to P's execution on I under abstract semantics. Then, the second step builds an oracle O for A. This O enables fast pruning during synthesis, by allowing us to efficiently find exactly those DSL programs that satisfy a given input-output example under abstract semantics. We have implemented this presynthesis-based synthesis paradigm in a framework, Foresighter. On top of it, we have developed three instantiations for SQL, string transformation, and matrix manipulation. All of them significantly outperform prior work in the respective domains.
翻译:抽象语义通过声学剪枝一组错误程序(无需逐一枚举),已被证明对加速基于搜索的程序合成至关重要。人们可能期望随着抽象语义的细粒度提升,合成速度会加快。然而据我们所知,目前情况并非如此。原因在于:当抽象粒度增加时,虽然枚举的程序数量减少,但剪枝成本显著上升。这从根本上限制了整体合成性能,而本研究正致力于解决这一问题。我们的核心思想是引入一个离线预合成阶段,该阶段包含两个步骤。给定一个具有抽象语义的领域特定语言(DSL),第一步语义建模为输入空间构建一个树自动机A——使得对于任意程序P和任意考虑的输入I,A都能产生一个与P在抽象语义下的执行轨迹相对应的运行路径。第二步则为A构建一个预言机O。该预言机通过使我们能够高效地精确找出在抽象语义下满足给定输入输出示例的DSL程序,从而实现合成过程中的快速剪枝。我们已在名为Foresighter的框架中实现了这种基于预合成的合成范式。在此基础上,我们为SQL语言、字符串变换和矩阵操作开发了三个具体实例化方案。所有方案均显著超越了各自领域先前工作的性能表现。