KAPSO: A Knowledge-grounded framework for Autonomous Program Synthesis and Optimization

We introduce KAPSO, a modular framework for autonomous program synthesis and optimization. Given a natural-language goal and an evaluation method, KAPSO iterates over ideation, code synthesis and editing, execution, evaluation, and learning to improve a runnable artifact toward measurable objectives. Rather than treating synthesis as the endpoint, KAPSO uses it as an operator within a long-horizon optimization loop in which progress is defined by evaluator outcomes. KAPSO targets long-horizon failures common in coding agents, including lost experimental state, brittle debugging, and weak reuse of domain expertise, by integrating three tightly coupled components. First, a git-native experimentation engine isolates each attempt as a branch, producing reproducible artifacts and preserving provenance across iterations. Second, a knowledge system ingests heterogeneous sources, including repositories, internal playbooks, and curated external resources such as documentation, scientific papers, and web search results, and organizes them into a structured representation that supports retrieval over workflows, implementations, and environment constraints. Third, a cognitive memory layer coordinates retrieval and maintains an episodic store of reusable lessons distilled from experiment traces (run logs, diffs, and evaluator feedback), reducing repeated error modes and accelerating convergence. We evaluate KAPSO on MLE-Bench (Kaggle-style ML competitions) and ALE-Bench (AtCoder heuristic optimization) and report end-to-end performance. Code is available at: https://github.com/Leeroo-AI/kapso
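
The loop described in the abstract (propose, evaluate, keep the best attempt, record lessons from attempts that did not help) can be illustrated with a small sketch. This is not KAPSO's API: every name here (`Attempt`, `Memory`, `optimize`, `toy_propose`, `toy_evaluate`) is hypothetical, the "artifact" is a single number rather than a program, and branch names are plain labels rather than real git branches. It only shows, under those assumptions, how evaluator outcomes can drive an iterative search while an episodic store influences later proposals.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    branch: str       # hypothetical label standing in for an isolated git branch
    candidate: float  # stand-in for a runnable artifact
    score: float      # evaluator outcome (higher is better)


@dataclass
class Memory:
    # Hypothetical episodic store: one short lesson per non-improving attempt.
    lessons: List[str] = field(default_factory=list)

    def learn(self, attempt: Attempt, best: Optional[Attempt]) -> None:
        if best is not None and attempt.score <= best.score:
            self.lessons.append(
                f"{attempt.branch}: no gain at {attempt.candidate:.2f}"
            )


def optimize(
    propose: Callable[[Optional[float], List[str]], float],
    evaluate: Callable[[float], float],
    iterations: int,
) -> Tuple[Attempt, List[Attempt], Memory]:
    """Run a fixed number of propose/evaluate/learn iterations."""
    memory = Memory()
    history: List[Attempt] = []
    best: Optional[Attempt] = None
    for i in range(iterations):
        parent = best.candidate if best is not None else None
        candidate = propose(parent, memory.lessons)
        attempt = Attempt(f"exp/{i:03d}", candidate, evaluate(candidate))
        memory.learn(attempt, best)   # distill a lesson before updating best
        history.append(attempt)       # every attempt is kept for provenance
        if best is None or attempt.score > best.score:
            best = attempt
    assert best is not None
    return best, history, memory


def toy_propose(parent: Optional[float], lessons: List[str]) -> float:
    # Start at 0.0; shrink the step as lessons about failed moves accumulate.
    if parent is None:
        return 0.0
    return parent + 1.0 / (1 + len(lessons))


def toy_evaluate(x: float) -> float:
    # Toy objective with its maximum at x = 3.
    return -((x - 3.0) ** 2)
```

Calling `optimize(toy_propose, toy_evaluate, 6)` walks the candidate from 0.0 up to 3.0, then overshoots twice; both overshoots are logged as lessons, and the second one uses a smaller step because a lesson already exists. In a real system the proposal step would be a code-editing agent and the evaluator a benchmark harness, neither of which this sketch attempts to model.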
