Mutation-based fuzzing is effective for uncovering compiler bugs, but designing high-quality mutators for modern languages with complex constructs (e.g., templates, macros) remains challenging. Existing methods rely heavily on manual design or human-in-the-loop correction, limiting scalability and cross-language generalizability. We present Mut4All, a fully automated, language-agnostic framework that synthesizes mutators using Large Language Models (LLMs) and compiler-specific knowledge from bug reports. It consists of three agents: (1) a mutator invention agent that identifies mutation targets and generates mutator metadata using compiler-related insights; (2) a mutator implementation synthesis agent, fine-tuned to produce initial implementations; and (3) a mutator refinement agent that verifies and corrects the mutators via unit-test feedback. Mut4All processes 1000 bug reports (500 Rust, 500 C++), yielding 319 Rust and 403 C++ mutators at ~$0.08 each via GPT-4o. Our customized fuzzer, using these mutators, finds 62 bugs in Rust compilers (38 new, 7 fixed) and 34 bugs in C++ compilers (16 new, 1 fixed). Mut4All outperforms existing methods in both unique crash detection and coverage, ranking first on Rust and second on C++.
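The three-agent workflow described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from Mut4All itself: the names `MutatorSpec`, `invent`, `implement`, `run_unit_tests`, `refine`, and `synthesize` are hypothetical, the LLM calls are replaced by hard-coded stubs, and the toy mutator rewrites program text, whereas a real mutator would more plausibly operate on a syntax tree. The sketch only shows the control flow: invent metadata, produce an initial implementation, then loop on unit-test feedback until the mutator passes or the round budget runs out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Toy representation (assumption): a mutator maps program text to mutated text.
Mutator = Callable[[str], str]


@dataclass
class MutatorSpec:
    """Metadata produced by the hypothetical invention step."""
    name: str
    target: str       # language construct the mutator should rewrite
    description: str


def invent(bug_report: str) -> MutatorSpec:
    # Stand-in for agent 1; a real system would query an LLM here.
    return MutatorSpec("widen_int", "u8", f"Widen integer types (from: {bug_report})")


def implement(spec: MutatorSpec) -> Mutator:
    # Stand-in for agent 2; deliberately buggy, it leaves the program unchanged.
    return lambda program: program


def run_unit_tests(spec: MutatorSpec, mutator: Mutator) -> List[str]:
    # Returns failure messages; an empty list means the mutator passed.
    seed = "fn f(x: u8) -> u8 { x }"
    failures = []
    if mutator(seed) == seed:
        failures.append("mutator did not change a seed containing the target")
    return failures


def refine(spec: MutatorSpec, mutator: Mutator, failures: List[str]) -> Mutator:
    # Stand-in for agent 3; a real system would send the failures back to an LLM.
    return lambda program: program.replace(spec.target, "u64")


def synthesize(bug_report: str, max_rounds: int = 3) -> Optional[Mutator]:
    spec = invent(bug_report)
    mutator = implement(spec)
    for _ in range(max_rounds):
        failures = run_unit_tests(spec, mutator)
        if not failures:
            return mutator
        mutator = refine(spec, mutator, failures)
    # Accept the last refinement only if it passes; otherwise discard it.
    return mutator if not run_unit_tests(spec, mutator) else None
```

In this toy setup, `synthesize("some bug report")` returns a mutator after one refinement round, while `max_rounds=0` returns `None` because the initial implementation never passes its test.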