Modern fuzzers scale to large, real-world software but often fail to exercise the program states developers consider most fragile or security-critical. Such states are typically deep in the execution space, gated by preconditions, or overshadowed by lower-value paths that consume limited fuzzing budgets. Meanwhile, developers routinely surface risk-relevant insights during code review, yet this information is largely ignored by automated testing tools. We present EyeQ, a system that leverages developer intelligence from code reviews to guide fuzzing. EyeQ extracts security-relevant signals from review discussions, localizes the implicated program regions, and translates these insights into annotation-based guidance for fuzzing. The approach operates atop existing annotation-aware fuzzing, requiring no changes to program semantics or developer workflows. We first validate EyeQ through a human-guided feasibility study on a security-focused dataset of PHP code reviews, establishing a strong baseline for review-guided fuzzing. We then automate the workflow using a large language model with carefully designed prompts. EyeQ significantly improves vulnerability discovery over standard fuzzing configurations, uncovering more than 40 previously unknown bugs in the security-critical PHP codebase.