This paper presents a method for automatically fixing implicit data loss warnings in large C++ projects using Large Language Models (LLMs). Our approach uses the Language Server Protocol (LSP) to gather context, Tree-sitter to extract the relevant code, and LLMs to make decisions and generate fixes. For each warning, the method weighs the need for a range check against its performance cost and generates a fix accordingly. We tested the method on a large C++ project, where human developers accepted 92.73% of the fixes during code review. Compared to a baseline fix strategy, our LLM-generated fixes reduced by 39.09% the number of warning fixes that introduced additional instructions through range checks and exception handling. This result was 13.56% behind the optimal solutions created by human developers. These findings demonstrate that our LLM-based approach can reduce the manual effort needed to address compiler warnings while maintaining code quality and performance in a real-world setting. The automated approach shows promise for integration into existing development workflows, potentially improving code maintenance practices in complex C++ software projects.
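To illustrate the trade-off between range checks and performance described above, the following is a minimal sketch of two ways to fix a narrowing conversion from a 64-bit to a 32-bit integer. The helper names (`narrow_checked`, `narrow_unchecked`, `millis_part`) are hypothetical and chosen for illustration; they are not taken from the paper or from the evaluated project.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: checked narrowing. The comparison and the throwing
// path are the kind of additional instructions a range check introduces.
inline std::int32_t narrow_checked(std::int64_t value) {
    if (value < std::numeric_limits<std::int32_t>::min() ||
        value > std::numeric_limits<std::int32_t>::max()) {
        throw std::out_of_range("value does not fit in int32_t");
    }
    return static_cast<std::int32_t>(value);
}

// Hypothetical helper: unchecked narrowing for call sites where the
// surrounding code already guarantees that the value fits. The assert
// documents the assumption and compiles away when NDEBUG is defined.
inline std::int32_t narrow_unchecked(std::int64_t value) {
    assert(value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max());
    return static_cast<std::int32_t>(value);
}

// Example call site: the remainder of a division by 1000 always lies in
// [-999, 999], so a range check would only add cost here.
inline std::int32_t millis_part(std::int64_t timestamp_ms) {
    return narrow_unchecked(timestamp_ms % 1000);
}
```

In this sketch, a call site such as `millis_part` is a case where context shows the range check is unnecessary, whereas a value read from external input would be a candidate for `narrow_checked`.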