Code review is a crucial but often complex, subjective, and time-consuming activity in software development. Over the past decades, significant efforts have been made to automate this process. Early approaches focused on knowledge-based systems (KBS) that apply rule-based mechanisms to detect code issues, providing precise feedback but struggling with complex, context-dependent cases. More recent work has shifted toward fine-tuning pre-trained language models for code review, enabling broader issue coverage but often at the expense of precision. In this paper, we propose a hybrid approach that combines the strengths of KBS and learning-based systems (LBS) to generate high-quality, comprehensive code reviews. Our method integrates knowledge at three distinct stages of the language model pipeline: during data preparation (Data-Augmented Training, DAT), at inference (Retrieval-Augmented Generation, RAG), and after inference (Naive Concatenation of Outputs, NCO). We empirically evaluate our combination strategies against standalone KBS and LBS fine-tuned on a real-world dataset. Our results show that these hybrid strategies enhance the relevance, completeness, and overall quality of review comments, effectively bridging the gap between rule-based tools and deep learning models.