Smart contracts are the backbone of the decentralized web, yet ensuring their functional correctness and security remains a critical challenge. While Large Language Models (LLMs) have shown promise in code generation, they often struggle with the rigorous requirements of smart contracts, frequently producing code that is buggy or vulnerable. To address this, we propose SolAgent, a novel tool-augmented multi-agent framework that mimics the workflow of human experts. SolAgent integrates a \textbf{dual-loop refinement mechanism}: an inner loop using the \textit{Forge} compiler to ensure functional correctness, and an outer loop leveraging the \textit{Slither} static analyzer to eliminate security vulnerabilities. Additionally, the agent is equipped with file system capabilities to resolve complex project dependencies. Experiments on the SolEval+ benchmark, a rigorous suite derived from high-quality real-world projects, demonstrate that SolAgent achieves a Pass@1 rate of up to \textbf{64.39\%}, significantly outperforming state-of-the-art LLMs ($\sim$25\%), AI IDEs (e.g., GitHub Copilot), and existing agent frameworks. Moreover, it reduces security vulnerabilities by up to \textbf{39.77\%} compared to human-written baselines. Finally, we demonstrate that the high-quality trajectories generated by SolAgent can be used to distill smaller, open-source models, democratizing access to secure smart contract generation. We release our data and code at https://github.com/openpaperz/SolAgent.
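The dual-loop refinement mechanism described in the abstract can be illustrated with a minimal sketch. This is not SolAgent's implementation: the names `ToolReport`, `dual_loop_refine`, `generate`, `check_functional`, and `check_security`, as well as the iteration budgets, are hypothetical and introduced here only for illustration. The two checkers stand in for a Forge-based functional check and a Slither-based security check, and the generator stands in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolReport:
    """Result of one tool run (hypothetical structure, not from the paper)."""
    ok: bool        # True when the tool reports nothing left to fix
    feedback: str   # tool output passed back to the generator


# Assumed signatures: the generator maps (task, feedback) to contract source,
# and each checker maps contract source to a ToolReport.
Generator = Callable[[str, str], str]
Checker = Callable[[str], ToolReport]


def dual_loop_refine(
    task: str,
    generate: Generator,
    check_functional: Checker,
    check_security: Checker,
    max_inner: int = 3,
    max_outer: int = 3,
) -> str:
    """Refine generated code with an inner functional loop and an outer security loop.

    Returns the latest candidate. If the budgets run out, that candidate
    may still fail one of the checks.
    """
    code = generate(task, "")
    for _ in range(max_outer):
        # Inner loop: revise until the functional check passes or the budget ends.
        for _ in range(max_inner):
            functional = check_functional(code)
            if functional.ok:
                break
            code = generate(task, functional.feedback)

        # Outer loop step: run the security check on the current candidate.
        security = check_security(code)
        if security.ok:
            return code
        # A security fix can break functionality, so the next outer
        # iteration re-enters the inner loop before checking security again.
        code = generate(task, security.feedback)
    return code
```

In a real setup, the checkers might wrap command-line invocations such as `forge test` and `slither .` and parse their output. That wiring is an assumption and is left out so the control flow stays visible.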