Execution Guided Line-by-Line Code Generation

We present a novel approach to neural code generation that incorporates real-time execution signals into the language model generation process. While large language models (LLMs) have demonstrated impressive code generation capabilities, they typically do not utilize execution feedback during inference, a critical signal that human programmers regularly leverage. Our method, Execution-Guided Classifier-Free Guidance (EG-CFG), dynamically incorporates execution signals as the model generates code, providing line-by-line feedback that guides the generation process toward executable solutions. EG-CFG employs a multi-stage process: first, we conduct beam search to sample candidate program completions for each line; second, we extract execution signals by executing these candidates against test cases; and finally, we incorporate these signals into the prompt during generation. By maintaining consistent signals across tokens within the same line and refreshing signals at line boundaries, our approach provides coherent guidance while preserving syntactic structure. Moreover, the method naturally supports native parallelism at the task level in which multiple agents operate in parallel, exploring diverse reasoning paths and collectively generating a broad set of candidate solutions. Our experiments across diverse coding tasks demonstrate that EG-CFG significantly improves code generation performance compared to standard approaches, achieving state-of-the-art results across various levels of complexity, from foundational problems to challenging competitive programming and data science tasks. Our code is available at: https://github.com/boazlavon/eg_cfg
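
The two core ingredients described above, turning candidate executions into a textual signal and steering token probabilities with classifier-free guidance, can be illustrated with a minimal sketch. It is not taken from the eg_cfg repository. The function names (`execution_signal`, `cfg_combine`), the signal's text format, and the use of the standard CFG interpolation in log-probability space are all assumptions made for illustration, since the abstract does not specify them. The "unconditional" logits are assumed to come from the prompt without execution signals and the "conditional" logits from the prompt with them.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cfg_combine(uncond_logits, cond_logits, gamma):
    """Standard CFG interpolation (assumed form, not confirmed by the abstract):
    log p = log p_uncond + gamma * (log p_cond - log p_uncond), renormalized.
    gamma = 0 ignores the execution signal; gamma = 1 uses the conditional
    distribution as-is; gamma > 1 amplifies the signal's influence.
    """
    lu = log_softmax(uncond_logits)
    lc = log_softmax(cond_logits)
    mixed = [u + gamma * (c - u) for u, c in zip(lu, lc)]
    return log_softmax(mixed)


def execution_signal(candidate_src, func_name, test_cases):
    """Run one candidate program against (args, expected) pairs and return a
    short textual trace that could be appended to the prompt.

    Hypothetical format. A real system would sandbox execution and enforce
    timeouts; plain exec() is used here only to keep the sketch short.
    """
    lines = []
    for args, expected in test_cases:
        env = {}
        try:
            exec(candidate_src, env)          # define the candidate function
            got = env[func_name](*args)       # call it on the test input
            status = "pass" if got == expected else "fail"
            lines.append(f"{func_name}{args!r} -> {got!r} ({status})")
        except Exception as exc:              # includes SyntaxError
            lines.append(f"{func_name}{args!r} raised {type(exc).__name__}")
    return "\n".join(lines)
```

In a line-by-line loop, one would call `execution_signal` on each sampled candidate once per line, keep the resulting text fixed while the tokens of that line are generated with `cfg_combine`, and recompute it only when a newline is emitted. That mirrors the "consistent within a line, refreshed at line boundaries" behavior described above.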
