Large Language Models cannot reliably acquire new knowledge post-deployment -- even when relevant text resources exist, models fail to transform them into actionable knowledge without retraining. Retrieval-Augmented Generation attempts to bridge this gap by surfacing relevant documents at inference time, yet similarity-based retrieval often fails to identify the context that actually improves task performance. We introduce Evolutionary Context Search (ECS), an evolutionary method that searches over context combinations using accuracy on a small development set as the fitness signal, requiring only inference calls and no weight updates. ECS moves beyond semantic similarity to discover non-obvious context pairings that substantially boost performance. Our empirical results show that ECS improves accuracy on BackendBench by 27\% and on $\tau$-bench airline by 7\%. The evolved contexts are model-agnostic: contexts evolved with Gemini-3-Flash transfer effectively to Claude Sonnet and DeepSeek. This suggests that ECS opens a path toward automated context discovery for skill acquisition -- an efficient alternative to manual prompt engineering or costly fine-tuning.
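The search loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the population size, selection scheme, crossover, and mutation rate are assumptions not specified in the text, and the toy fitness function stands in for dev-set accuracy obtained via inference calls.

```python
import random

def evolve_contexts(pool, evaluate, pop_size=8, generations=5, k=3):
    """Hypothetical ECS-style loop: evolve size-k combinations of contexts
    from `pool`, scored by `evaluate` (a stand-in for dev-set accuracy).
    Only evaluation calls are made; no model weights are touched."""
    # Each individual is a combination of k contexts drawn from the pool.
    population = [random.sample(pool, k) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: pop_size // 2]  # selection: keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            # Crossover: child draws k contexts from the parents' union.
            child = random.sample(list(set(a) | set(b)), k)
            if random.random() < 0.3:  # mutation: swap in a fresh context
                child[random.randrange(k)] = random.choice(pool)
            children.append(child)
        population = parents + children
    return max(population, key=evaluate)

# Toy run: contexts are integers, and "accuracy" is their sum.
random.seed(0)
best = evolve_contexts(list(range(20)), evaluate=sum)
```

In practice `evaluate` would assemble the candidate contexts into the prompt, run the model on the development set, and return accuracy, which is the only signal the search needs.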