Generative neural models hold great promise for enhancing programming education by synthesizing new content. We seek to design neural models that can automatically generate programming tasks for a given specification in the context of visual programming domains. Despite the recent successes of large generative models like GPT-4, our initial results show that these models are ineffective at synthesizing visual programming tasks and struggle with logical and spatial reasoning. We propose a novel neuro-symbolic technique, NeurTaskSyn, that can synthesize programming tasks for a specification given in the form of desired programming concepts exercised by its solution code and constraints on the visual task. NeurTaskSyn has two components: the first component is trained via an imitation learning procedure to generate possible solution codes, and the second component is trained via a reinforcement learning procedure to guide an underlying symbolic execution engine that generates visual tasks for these codes. We demonstrate the effectiveness of NeurTaskSyn through an extensive empirical evaluation and a qualitative study on reference tasks taken from the Hour of Code: Classic Maze challenge by Code-dot-org and the Intro to Programming with Karel course by CodeHS-dot-com.
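To make the two-stage structure described in the abstract concrete, the following is a minimal sketch in Python and not the NeurTaskSyn implementation. Everything in it is an assumption made for illustration: the statement names (`if_path_ahead`, `move`, `turn_left`), the grid encoding, and the functions `generate_code`, `synthesize_task`, and `solves` are hypothetical. The trained code generator is replaced by a fixed lookup, and the trained policy that guides symbolic execution is replaced by a random choice with retries. What the sketch does show is the general idea: solution code is produced first, and the visual task is then built by executing that code on an undecided grid, fixing each cell only when the execution inspects it.

```python
import random

# Headings as (row, col) deltas: north, east, south, west.
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def generate_code(concepts):
    """Stage 1 stand-in. The paper trains a neural generator with imitation
    learning; this hypothetical lookup returns one fixed loop body, read as:
    repeat_until_goal { if path_ahead: move else: turn_left }."""
    if {"loop", "conditional"} <= set(concepts):
        return [("if_path_ahead", "move", "turn_left")]
    raise ValueError("this sketch only covers loop + conditional")


def step(body, pos, heading, is_free):
    """Run the loop body once; return (pos, heading), or None on a crash."""
    for stmt in body:
        ahead = (pos[0] + DIRS[heading][0], pos[1] + DIRS[heading][1])
        if isinstance(stmt, tuple):          # ("if_path_ahead", then, else)
            action = stmt[1] if is_free(ahead) else stmt[2]
        else:
            action = stmt
        if action == "move":
            if not is_free(ahead):
                return None                  # walked into a wall
            pos = ahead
        elif action == "turn_left":
            heading = (heading - 1) % 4
    return pos, heading


def synthesize_task(body, size, policy, n_iters):
    """Stage 2 stand-in: execute the code on an undecided grid and let
    `policy(cell, grid)` fix each cell the first time execution looks at it."""
    grid = {}                                # (row, col) -> True if free
    pos, heading = (size // 2, size // 2), 1 # start mid-grid, facing east
    start = (pos, heading)
    grid[pos] = True

    def is_free(cell):
        if not (0 <= cell[0] < size and 0 <= cell[1] < size):
            return False                     # outside the grid counts as wall
        if cell not in grid:
            grid[cell] = bool(policy(cell, grid))
        return grid[cell]

    seen = []                                # positions at each loop check
    for _ in range(n_iters):
        seen.append(pos)
        result = step(body, pos, heading, is_free)
        if result is None:
            return None                      # decisions made the code crash
        pos, heading = result
    if pos in seen:
        return None                          # goal here would end the loop early
    free = {cell for cell, ok in grid.items() if ok}
    return {"size": size, "free": free, "start": start, "goal": pos}


def solves(body, task, max_iters=1000):
    """Concrete check: run the code on the finished task (undecided = wall)."""
    pos, heading = task["start"]
    for _ in range(max_iters):
        if pos == task["goal"]:
            return True
        result = step(body, pos, heading, lambda cell: cell in task["free"])
        if result is None:
            return False
        pos, heading = result
    return pos == task["goal"]


# Usage: a random policy with retries stands in for the learned one.
rng = random.Random(0)
body = generate_code({"loop", "conditional"})
task = None
for _ in range(100):
    task = synthesize_task(body, 8, lambda cell, grid: rng.random() < 0.7, 6)
    if task is not None:
        break
if task is not None:
    print(task["start"], task["goal"], solves(body, task))
```

In this sketch the random policy often produces rejected or uninteresting grids, which is why it is wrapped in a retry loop. Per the abstract, the second component of NeurTaskSyn is trained with reinforcement learning to make these choices instead.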