ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation

Large language models (LLMs) have recently achieved impressive performance in code generation, offering programmers revolutionary assistance in software development. However, because LLMs are auto-regressive, they are susceptible to error accumulation during code generation. Once an error is produced, an LLM cannot adjust its previous outputs and can only continue generating subsequent code conditioned on that error. Existing LLM-based approaches typically rely on post-revising after code generation, which makes accumulated errors hard to resolve and wastes significant resources. Ideally, LLMs should roll back and resolve an error as soon as it occurs during code generation, rather than proceeding on the basis of the error and waiting for post-revising after generation. In this paper, we propose ROCODE, which integrates a backtracking mechanism and program analysis into LLMs for code generation. Specifically, we employ program analysis to perform incremental error detection during the generation process. When an error is detected, the backtracking mechanism is triggered to initiate rollback strategies and constraint regeneration, thereby eliminating the error early and ensuring that generation continues from a correct basis. Experiments on multiple code generation benchmarks show that ROCODE significantly reduces the errors generated by LLMs, reaching a compilation pass rate of 99.1%. The test pass rate improves by up to 23.8% compared to the best baseline approach, and the token cost is reduced by 19.3% compared to the post-revising baseline. Moreover, our approach is model-agnostic and achieves consistent improvements across nine representative LLMs.
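The detect, roll back, and regenerate loop described above can be illustrated with a small toy sketch. This is not the ROCODE implementation: the fixed `CANDIDATES` table stands in for an LLM's ranked outputs, generation is line-level rather than token-level, and the "program analysis" is reduced to a syntax check plus an undefined-name check built on Python's `ast` module. All function and variable names are hypothetical and chosen only for illustration.

```python
import ast
import builtins

# Hypothetical stand-in for an LLM: ranked candidate lines for each step.
CANDIDATES = [
    ["total = sum(values)"],
    ["n = len(values)", "count = len(values)"],
    ["mean = total / count"],                 # only valid if `count` exists
    ["result = round(mean, 2", "result = round(mean, 2)"],  # first has a syntax error
]

def find_error(line, defined):
    """Return an error message for `line`, or None if it passes both checks."""
    try:
        tree = ast.parse(line)
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in defined and not hasattr(builtins, node.id):
                return f"undefined name: {node.id}"
    return None

def assigned_names(line):
    """Names that `line` binds (assumes `line` already parsed cleanly)."""
    tree = ast.parse(line)
    return {n.id for n in ast.walk(tree)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}

def generate(candidates, params):
    accepted = []                              # lines kept so far
    banned = [set() for _ in candidates]       # per-step regeneration constraints
    events = []                                # log of rejections and rollbacks
    step = 0
    while step < len(candidates):
        options = [c for c in candidates[step] if c not in banned[step]]
        if not options:
            if step == 0:
                raise RuntimeError("no valid program under the constraints")
            # Roll back one step: discard the previous line and forbid it.
            banned[step].clear()
            step -= 1
            removed = accepted.pop()
            banned[step].add(removed)
            events.append(f"rollback to step {step}, banned {removed!r}")
            continue
        defined = set(params)
        for kept in accepted:
            defined |= assigned_names(kept)
        line = options[0]
        error = find_error(line, defined)      # incremental check on the new line
        if error is None:
            accepted.append(line)
            step += 1
        else:
            banned[step].add(line)             # constrain the next attempt
            events.append(f"step {step}: rejected {line!r} ({error})")
    return accepted, events

program, events = generate(CANDIDATES, params=["values"])
for event in events:
    print(event)
print("\n".join(program))
```

In this run, the undefined `count` at the third step cannot be fixed locally, so the sketch rolls back to the second step, bans the line that defined `n`, and regenerates; the later syntax error is fixed by a local retry. A real system would regenerate by sampling from the model under the constraint rather than picking from a fixed list.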
