SEVerA: Verified Synthesis of Self-Evolving Agents

Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. However, existing self-evolving agent frameworks provide no formal guarantees of safety or correctness. Because such programs are often executed autonomously on unseen inputs, this lack of guarantees raises reliability and security concerns. We formulate agentic code generation as a constrained learning problem, combining hard formal specifications with soft objectives capturing task utility. We introduce Formally Guarded Generative Models (FGGM), which allow the planner LLM to specify a formal output contract for each generative model call using first-order logic. Each FGGM call wraps the underlying model in a rejection sampler with a verified fallback, ensuring every returned output satisfies the contract for any input and parameter setting. Building on FGGM, we present SEVerA (Self-Evolving Verified Agents), a three-stage framework: Search synthesizes candidate parametric programs containing FGGM calls; Verification proves correctness with respect to hard constraints for all parameter values, reducing the problem to unconstrained learning; and Learning applies scalable gradient-based optimization, including GRPO-style fine-tuning, to improve the soft objective while preserving correctness. We evaluate SEVerA on Dafny program verification, symbolic math synthesis, and policy-compliant agentic tool use ($τ^2$-bench). Across tasks, SEVerA achieves zero constraint violations while improving performance over unconstrained and SOTA baselines, showing that formal behavioral constraints not only guarantee correctness but also steer synthesis toward higher-quality agents.

翻译：近期进展表明，自我演化的大语言模型智能体在程序修复和科学发现等任务上具有显著效果。在该范式中，规划器大语言模型合成一个调用参数化模型（包括大语言模型）的智能体程序，随后每个任务对这些模型进行微调以提升性能。然而，现有的自我演化智能体框架未提供任何关于安全性或正确性的形式化保证。由于此类程序常在不可见输入上自主执行，这种保证缺失引发了可靠性与安全性担忧。我们将智能体代码生成形式化为一个约束学习问题，结合硬形式化规约与捕捉任务效用的软目标。我们引入形式化守卫生成模型（FGGM），该模型允许规划器大语言模型使用一阶逻辑为每个生成模型调用指定形式化输出契约。每个FGGM调用将底层模型封装在带有已验证后备方案的回绝采样器中，确保对于任意输入和参数设置，每个返回输出均满足契约。基于FGGM，我们提出SEVerA（自我演化可验证智能体），这是一个三阶段框架：搜索阶段合成包含FGGM调用的候选参数化程序；验证阶段证明这些程序在所有参数值下均满足硬约束的正确性，从而将问题简化为无约束学习；学习阶段应用可扩展的基于梯度的优化（包括GRPO风格微调），在保持正确性的同时改进软目标。我们在Dafny程序验证、符号数学综合以及策略合规的智能体工具使用（τ²基准）上评估SEVerA。在各项任务中，SEVerA在保持零约束违规的同时，性能优于无约束及最先进基线，表明形式化行为约束不仅能保证正确性，还能引导合成过程生成更高质量的智能体。