SEVerA: Verified Synthesis of Self-Evolving Agents

Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. However, existing self-evolving agent frameworks provide no formal guarantees of safety or correctness. Because such programs are often executed autonomously on unseen inputs, this lack of guarantees raises reliability and security concerns. We formulate agentic code generation as a constrained learning problem, combining hard formal specifications with soft objectives capturing task utility. We introduce Formally Guarded Generative Models (FGGM), which allow the planner LLM to specify a formal output contract for each generative model call using first-order logic. Each FGGM call wraps the underlying model in a rejection sampler with a verified fallback, ensuring every returned output satisfies the contract for any input and parameter setting. Building on FGGM, we present SEVerA (Self-Evolving Verified Agents), a three-stage framework: Search synthesizes candidate parametric programs containing FGGM calls; Verification proves correctness with respect to hard constraints for all parameter values, reducing the problem to unconstrained learning; and Learning applies scalable gradient-based optimization, including GRPO-style fine-tuning, to improve the soft objective while preserving correctness. We evaluate SEVerA on Dafny program verification, symbolic math synthesis, and policy-compliant agentic tool use ($τ^2$-bench). Across tasks, SEVerA achieves zero constraint violations while improving performance over unconstrained and SOTA baselines, showing that formal behavioral constraints not only guarantee correctness but also steer synthesis toward higher-quality agents.

翻译：近期研究进展表明，自演进大语言模型（LLM）智能体在程序修复与科学发现等任务中展现出显著有效性。在该范式下，规划型LLM合成可调用参数化模型（包括LLM）的智能体程序，随后针对具体任务对模型进行微调以提升性能。然而，现有自演进智能体框架缺乏形式化的安全性与正确性保证。由于此类程序通常需在未见过输入上自主执行，这种保证缺失引发了可靠性与安全隐患。本文将智能体代码生成建模为约束学习问题，将硬性形式规范与刻画任务效用的软性目标相结合。我们提出形式化防护生成模型（Formally Guarded Generative Models, FGGM），该模型允许规划型LLM通过一阶逻辑为每次生成模型调用指定形式化输出合约。每个FGGM调用通过带验证回退机制的拒绝采样器封装底层模型，确保任意输入与参数设置下的返回输出均满足合约约束。基于FGGM，我们提出SEVerA（Self-Evolving Verified Agents，自演进验证智能体）三阶段框架：搜索阶段合成包含FGGM调用的候选参数化程序；验证阶段证明程序在所有参数取值下对硬约束的正确性，将问题转化为无约束学习；学习阶段采用可扩展的梯度优化方法（包括GRPO风格微调）在保证正确性的同时优化软目标。我们在Dafny程序验证、符号数学合成及策略合规智能体工具使用（τ²-bench）三个任务上评估SEVerA。实验表明，SEVerA在保持零约束违反的同时，性能全面超越无约束基线与当前最优方法，证明形式化行为约束不仅能保证正确性，更能引导合成出更高质量的智能体。