When writing programs, people can tackle a new, complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, we can measure whether they compositionally generalize, that is, whether a model trained on simpler subtasks can subsequently solve more complex tasks. In this paper, we characterize several forms of compositional generalization that are desirable in program synthesis, forming a meta-benchmark that we use to create generalization tasks for two popular datasets, RobustFill and DeepCoder. We then propose ExeDec, a novel decomposition-based synthesis strategy that predicts execution subgoals to solve problems step by step, informed by program execution at each step. ExeDec achieves better synthesis performance and greatly improved compositional generalization ability compared to baselines.
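To make the idea of subgoal-driven decomposition concrete, the following is a minimal sketch under strong assumptions, not the ExeDec implementation. The four-operation string DSL, the function names, and the greedy loop without backtracking are all hypothetical, and the subgoal predictor is a brute-force stand-in for what would be a learned model. It only illustrates the loop structure: predict what the next subprogram should output, synthesize a subprogram that produces it, execute that subprogram, and update what remains to be solved.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical toy DSL: each operation maps an input string to one output piece.
OPS: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "lower": str.lower,
    "first": lambda s: s[:1],
    "last": lambda s: s[-1:],
}


def predict_subgoals(inputs: List[str], remaining: List[str]) -> Optional[List[str]]:
    """Stand-in for a learned subgoal model (assumption, not a neural network).

    Returns, for every example, the string the next subprogram should produce:
    a non-empty prefix of the remaining output that some DSL operation yields.
    """
    for op in OPS.values():
        pieces = [op(x) for x in inputs]
        if all(p and r.startswith(p) for p, r in zip(pieces, remaining)):
            return pieces
    return None


def synthesize_step(inputs: List[str], subgoals: List[str]) -> Optional[str]:
    """Find one operation whose execution matches the subgoal on every example."""
    for name, op in OPS.items():
        if [op(x) for x in inputs] == subgoals:
            return name
    return None


def decompose_and_solve(
    inputs: List[str], outputs: List[str], max_steps: int = 8
) -> Optional[List[str]]:
    """Greedy step-by-step synthesis; returns a list of operation names or None."""
    remaining = list(outputs)
    program: List[str] = []
    for _ in range(max_steps):
        if all(r == "" for r in remaining):
            return program
        subgoals = predict_subgoals(inputs, remaining)
        if subgoals is None:
            return None
        name = synthesize_step(inputs, subgoals)
        if name is None:
            return None
        # Execute the chosen subprogram and strip what it produced from the target.
        executed = [OPS[name](x) for x in inputs]
        remaining = [r[len(e):] for r, e in zip(remaining, executed)]
        program.append(name)
    return program if all(r == "" for r in remaining) else None
```

For example, `decompose_and_solve(["ab", "cd"], ["aAB", "cCD"])` returns `["first", "upper"]`: the first step accounts for the leading character of each output, and the second step accounts for the uppercase remainder. Because the loop is greedy, a wrong early choice is never revisited; a practical system would search over several candidate subgoals.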