Automating code review with Large Language Models (LLMs) shows immense promise, yet practical adoption is hampered by their lack of reliability, context-awareness, and control. To address this, we propose Specification-Grounded Code Review (SGCR), a framework that grounds LLMs in human-authored specifications to produce trustworthy and relevant feedback. SGCR features a novel dual-pathway architecture: an explicit path ensures deterministic compliance with predefined rules derived from these specifications, while an implicit path heuristically discovers and verifies issues beyond those rules. Deployed in a live industrial environment at HiThink Research, SGCR's suggestions achieved a 42% developer adoption rate, a 90.9% relative improvement over a baseline LLM (22%). Our work demonstrates that specification-grounding is a powerful paradigm for bridging the gap between the generative power of LLMs and the rigorous reliability demands of software engineering.
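To make the dual-pathway architecture concrete, the following is a minimal conceptual sketch of how such a system might be organized. It is not the paper's implementation; all names (SpecRule, explicit_review, implicit_review, llm_propose_issues, llm_verify_issue) are hypothetical, and the LLM calls are stubbed out as placeholders.

```python
# Conceptual sketch of a dual-pathway reviewer, assuming a simple regex-based
# explicit path and a stubbed LLM-driven implicit path. Illustrative only.
import re
from dataclasses import dataclass


@dataclass
class SpecRule:
    """A predefined rule derived from a human-authored specification."""
    rule_id: str
    pattern: str   # regex the diff must NOT match
    message: str


def explicit_review(diff: str, rules: list[SpecRule]) -> list[str]:
    """Explicit path: deterministic compliance checks against spec rules."""
    return [
        f"[{rule.rule_id}] {rule.message}"
        for rule in rules
        if re.search(rule.pattern, diff)
    ]


def llm_propose_issues(diff: str, spec_text: str) -> list[str]:
    """Placeholder for an LLM call proposing issues beyond the fixed rules."""
    return []  # a real system would query a model grounded in spec_text


def llm_verify_issue(diff: str, issue: str) -> bool:
    """Placeholder for a second verification pass that filters weak findings."""
    return True


def implicit_review(diff: str, spec_text: str) -> list[str]:
    """Implicit path: heuristically discover candidate issues, then verify each."""
    return [
        issue
        for issue in llm_propose_issues(diff, spec_text)
        if llm_verify_issue(diff, issue)
    ]


def review(diff: str, rules: list[SpecRule], spec_text: str) -> list[str]:
    """Combine both pathways into one set of review comments."""
    return explicit_review(diff, rules) + implicit_review(diff, spec_text)


if __name__ == "__main__":
    rules = [SpecRule("SEC-001", r"\beval\(", "Avoid eval(); see security spec.")]
    print(review("result = eval(user_input)", rules, "(specification text)"))
```

The key design point the sketch illustrates is the separation of concerns: rule violations from the explicit path are deterministic and reproducible, while implicit-path findings pass through an extra verification step before being surfaced to developers.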