Auto-repair without test cases: How LLMs fix compilation errors in large industrial embedded code

The co-development of hardware and software in industrial embedded systems frequently leads to compilation errors during continuous integration (CI). Automated repair of such failures is promising, but existing techniques rely on test cases, which are not available for non-compilable code. We employ an automated repair approach for compilation errors driven by large language models (LLMs). Our study draws on more than 40,000 commits collected from the product's source code. We assess the performance of an industrial CI system enhanced by four state-of-the-art LLMs, comparing their outcomes with manual corrections provided by human programmers. LLM-equipped CI systems can resolve up to 63% of the compilation errors in our baseline dataset. Among the fixes associated with successful CI builds, 83% are deemed reasonable. Moreover, LLMs significantly reduce debugging time: the majority of successful cases complete within 8 minutes, compared with the hours typically required for manual debugging.

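The abstract does not spell out the repair procedure, so the following is only a minimal sketch of one plausible shape for it, not the authors' implementation. It assumes that, in the absence of test cases, the build result and compiler log are the only feedback signal. The names `repair_loop`, `build`, `propose_fix`, and `max_attempts` are hypothetical, and `toy_build` and `toy_fix` are stand-ins for a real compiler invocation and a real LLM call.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   build(source)            -> (succeeded, compiler_log)
#   propose_fix(source, log) -> revised source
BuildFn = Callable[[str], Tuple[bool, str]]
FixFn = Callable[[str, str], str]


def repair_loop(source: str, build: BuildFn, propose_fix: FixFn,
                max_attempts: int = 3) -> Tuple[bool, str, int]:
    """Try to make `source` compile, using only the compiler log as feedback."""
    ok, log = build(source)
    attempts = 0
    while not ok and attempts < max_attempts:
        source = propose_fix(source, log)  # in a real system: an LLM call
        ok, log = build(source)            # in a real system: the CI build
        attempts += 1
    return ok, source, attempts


def toy_build(src: str) -> Tuple[bool, str]:
    """Stand-in compiler: accepts only if every non-empty line ends with ';'."""
    bad = [i for i, line in enumerate(src.splitlines(), 1)
           if line.strip() and not line.rstrip().endswith(";")]
    if bad:
        return False, f"error: expected ';' at line {bad[0]}"
    return True, ""


def toy_fix(src: str, log: str) -> str:
    """Stand-in fixer: patches only the single line named in the log."""
    line_no = int(log.rsplit(" ", 1)[1])
    lines = src.splitlines()
    lines[line_no - 1] = lines[line_no - 1].rstrip() + ";"
    return "\n".join(lines)


if __name__ == "__main__":
    print(repair_loop("int a = 1\nint b = 2", toy_build, toy_fix))
```

A loop of this kind stops as soon as the build succeeds, which is why a passing build is a weaker criterion than a reasonable fix: the abstract reports that 83% of fixes with successful CI builds were deemed reasonable, so human review of the remaining cases still matters.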