Program synthesis is the task of automatically generating code based on a specification. In Syntax-Guided Synthesis (SyGuS) this specification is a combination of a syntactic template and a logical formula, and the result is guaranteed to satisfy both. We present a reinforcement-learning guided algorithm for SyGuS which uses Monte-Carlo Tree Search (MCTS) to search the space of candidate solutions. Our algorithm learns policy and value functions which, combined with the upper confidence bound for trees, allow it to balance exploration and exploitation. A common challenge in applying machine learning approaches to syntax-guided synthesis is the scarcity of training data. To address this, we present a method for automatically generating training data for SyGuS based on anti-unification of existing first-order satisfiability problems, which we use to train our MCTS policy. We implement and evaluate this setup and demonstrate that learned policy and value improve the synthesis performance over a baseline by over 26 percentage points in the training and testing sets. Our tool outperforms state-of-the-art tool cvc5 on the training set and performs comparably in terms of the total number of problems solved on the testing set (solving 23% of the benchmarks on which cvc5 fails). We make our data set publicly available, to enable further application of machine learning methods to the SyGuS problem.
翻译:程序综合是指根据规范自动生成代码的任务。在语法引导综合(SyGuS)中,该规范由语法模板和逻辑公式组成,生成的结果需同时满足两者。我们提出了一种基于强化学习的语法引导综合算法,该算法采用蒙特卡洛树搜索(MCTS)来探索候选解空间。该算法通过学习策略函数和值函数,结合树的上置信界(UCB)方法,能有效平衡探索与利用。将机器学习方法应用于语法引导综合时,训练数据匮乏是常见挑战。为此,我们提出了一种基于反合一(anti-unification)技术从现有的一阶可满足性问题中自动生成SyGuS训练数据的方法,并以此训练MCTS策略函数。我们实现并评估了该框架,实验表明,在训练集和测试集上,学习得到的策略与值函数使综合性能较基准方法提升超过26个百分点。我们的工具在训练集上超越了当前最先进的工具cvc5,在测试集上求解问题总数与之相当(解决了cvc5未能处理的23%的基准问题)。为促进机器学习方法在SyGuS问题中的进一步应用,我们公开了相关数据集。