Code Evolution for Control: Synthesizing Policies via LLM-Driven Evolutionary Search

Designing effective control policies for autonomous systems remains a fundamental challenge, traditionally addressed through reinforcement learning or manual engineering. While reinforcement learning has achieved remarkable success, it often suffers from high sample complexity and reward shaping difficulties, and it produces opaque neural network policies that are hard to interpret or verify. Manual design, on the other hand, requires substantial domain expertise and struggles to scale across diverse tasks. In this work, we demonstrate that LLM-driven evolutionary search can effectively synthesize interpretable control policies in the form of executable code. By treating policy synthesis as a code evolution problem, we harness the LLM's prior knowledge of programming patterns and control heuristics while employing evolutionary search to explore the solution space systematically. We implement our approach using EvoToolkit, a framework that integrates LLM-driven evolution with customizable fitness evaluation. Our method iteratively evolves populations of candidate policy programs, evaluating them against task-specific objectives and selecting the fittest individuals for reproduction. This process yields compact, human-readable control policies that can be directly inspected, modified, and formally verified. This work highlights the potential of combining foundation models with evolutionary computation for synthesizing trustworthy control policies in autonomous systems. Code is available at https://github.com/pgg3/EvoControl.
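
The evolve-evaluate-select loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from EvoToolkit or the EvoControl repository: the names `SEED_POLICY`, `compile_policy`, `fitness`, `propose_variant`, and `evolve` are hypothetical, the task is a toy point mass driven to the origin, and the LLM call is replaced by a stub that randomly perturbs one numeric constant in the policy source. In a real system, `propose_variant` would presumably prompt a language model with the parent program and its score.

```python
import random
import re

# Hypothetical seed program: a policy is plain Python source, so it stays readable.
SEED_POLICY = """
def policy(pos, vel):
    # Proportional-derivative rule: push toward the origin, damp velocity.
    return -1.0 * pos - 0.5 * vel
"""


def compile_policy(source):
    """Turn policy source code into a callable."""
    namespace = {}
    exec(source, namespace)
    return namespace["policy"]


def fitness(source, steps=200, dt=0.05):
    """Negative accumulated cost of steering a unit point mass to the origin."""
    try:
        policy = compile_policy(source)
        pos, vel, cost = 1.0, 0.0, 0.0
        for _ in range(steps):
            # Clamp the commanded force to a toy actuator limit.
            force = max(-2.0, min(2.0, float(policy(pos, vel))))
            vel += force * dt
            pos += vel * dt
            cost += (pos * pos + 0.1 * force * force) * dt
        return -cost
    except Exception:
        # Programs that fail to compile or run get the worst possible score.
        return float("-inf")


def propose_variant(parent_source, rng):
    """Stand-in for the LLM call (assumption): jitter one numeric constant."""
    literals = list(re.finditer(r"\d+\.\d+", parent_source))
    target = rng.choice(literals)
    new_value = max(0.0, float(target.group()) + rng.gauss(0.0, 0.3))
    return (parent_source[:target.start()]
            + f"{new_value:.3f}"
            + parent_source[target.end():])


def evolve(generations=20, pop_size=8, n_parents=2, seed=0):
    """Keep the best programs each generation and refill with variants."""
    rng = random.Random(seed)
    population = [SEED_POLICY]
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = [propose_variant(rng.choice(parents), rng)
                    for _ in range(pop_size - n_parents)]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best)
    print("fitness:", fitness(best))
```

Because the parents are carried over unchanged into each new generation, the best score in the population never decreases, and the final result is still ordinary source code that a reader can open, edit, and re-evaluate with `fitness`.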
