Large language models are increasingly used for code generation, but many generated programs fail to compile, a prerequisite for further correctness checks such as unit tests. Existing solutions for repairing static errors are costly in both latency and token consumption. Post-hoc repair delays error detection until generation completes and commonly regenerates large regions of previously valid code. Constrained semantic decoding checks after each token, incurring per-token overhead while limiting repair to the current token even when the root cause lies earlier. We present Hydra, a system for efficient recovery from static errors during code generation. Hydra allows checking to proceed asynchronously with generation, avoiding checker overhead when the generated code is semantically correct. In addition, it provides checkpoint-and-rollback support for targeted repair, avoiding regeneration and rechecking of valid prefixes. We retrofit the Clang C/C++ compiler to support Hydra with modest modifications. Paired with a token-efficient repair strategy, Hydra reduces latency by up to 71% and token consumption by up to 70% relative to post-hoc repair on C/C++ code generation tasks that encounter static errors.
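The interaction between lagging checks and checkpoint-and-rollback can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not Hydra's implementation: `ToyChecker` is a hypothetical bracket-matching checker standing in for a compiler front end, the `lag` parameter simulates asynchronous checking deterministically instead of using real concurrency, and the `repaired` token list stands in for whatever repair strategy the model would apply after a rollback.

```python
# Toy illustration only: every name here is hypothetical, not part of Hydra or Clang.
PAIRS = {")": "(", "}": "{"}


class ToyChecker:
    """Incremental bracket checker standing in for a compiler front end."""

    def __init__(self):
        self.stack = []    # brackets that are currently open
        self.checked = 0   # number of tokens validated so far

    def feed(self, tok):
        """Consume one token; return False if it causes a static error."""
        if tok in PAIRS.values():
            self.stack.append(tok)
        elif tok in PAIRS:
            if not self.stack or self.stack[-1] != PAIRS[tok]:
                return False
            self.stack.pop()
        self.checked += 1
        return True

    def checkpoint(self):
        return (list(self.stack), self.checked)

    def rollback(self, snap):
        self.stack, self.checked = list(snap[0]), snap[1]


def run(draft, repaired, lag=2, every=4):
    """Generate tokens while the checker trails `lag` tokens behind.

    Assumes `repaired` is valid and agrees with `draft` on the checked prefix.
    Returns the final tokens and the total number of tokens emitted.
    """
    checker = ToyChecker()
    snaps = [checker.checkpoint()]           # checkpoint at position 0
    out, script, emitted = [], draft, 0
    while len(out) < len(script) or checker.checked < len(out):
        if len(out) < len(script):           # generator emits one token
            out.append(script[len(out)])
            emitted += 1
        done = len(out) == len(script)
        limit = len(out) if done else max(0, len(out) - lag)
        while checker.checked < limit:       # checker catches up
            if not checker.feed(out[checker.checked]):
                snap = snaps[-1]             # latest checkpoint before the error
                checker.rollback(snap)
                del out[snap[1]:]            # keep only the validated prefix
                script = repaired            # stand-in for a targeted repair
                break
            if checker.checked % every == 0:
                snaps.append(checker.checkpoint())
    return out, emitted


DRAFT = ["int", "f", "(", ")", "{", "return", "0", ";", ")", "}"]
REPAIRED = ["int", "f", "(", ")", "{", "return", "0", ";", "}"]

out, emitted = run(DRAFT, REPAIRED)
print(" ".join(out), emitted)
```

In this toy run the stray `)` is detected only after the draft has finished, because the checker trails the generator, yet the rollback discards just the two tokens after the last checkpoint. The run emits 11 tokens in total, whereas discarding the draft and regenerating the whole program would emit 10 + 9 = 19. These numbers describe the toy only and say nothing about the measured results reported above.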