A semantic gap separates how users describe tasks from how tools are documented. As API ecosystems scale to tens of thousands of endpoints, static retrieval from the initial query alone cannot bridge this gap: the agent's understanding of what it needs evolves during execution, but its tool set does not. We introduce FitText, a training-free framework that makes retrieval dynamic by embedding it directly in the agent's reasoning loop. FitText generates natural-language pseudo-tool descriptions as retrieval probes, refines them iteratively using retrieval feedback, and explores diverse alternatives through stochastic generation. Memetic Retrieval adds evolutionary selection pressure over candidate descriptions, guided by a tool memory that avoids redundant search. On ToolRet (43k tools, 4 domains), FitText improves average retrieval rank from 8.81 to 2.78; on StableToolBench (16,464 APIs), it achieves a 0.73 average pass rate, a 24-point absolute gain over static query retrieval. The gains transfer across base models capable of acting as competent semantic operators. With weaker base models, Memetic Retrieval's evolutionary search inverts, amplifying noise rather than refining signal, which surfaces model capacity as a prerequisite for evolutionary tool exploration.
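To make the contrast between static and in-loop retrieval concrete, here is a minimal sketch, not the FitText implementation. It assumes a made-up three-tool corpus, a bag-of-words cosine scorer in place of a real retriever, and hand-written probe strings standing in for the pseudo-tool descriptions an LLM would generate. The refinement step uses simple pseudo-relevance feedback (appending the top hit's documentation to the probe), which is a substitute chosen for illustration. Stochastic generation and evolutionary selection are not modeled. All names (`TOOLS`, `retrieve`, `run_agent`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: tool name -> documentation string.
TOOLS = {
    "geo.lookup_city": "resolve a city name to latitude and longitude coordinates",
    "weather.forecast": "return the weather forecast for latitude and longitude coordinates",
    "fx.convert": "convert an amount between two currencies using the latest exchange rate",
}


def bow(text):
    """Lowercased bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(probe, k=1, exclude=()):
    """Return the top-k tool names for a probe, skipping excluded tools."""
    scored = [
        (cosine(bow(probe), bow(doc)), name)
        for name, doc in TOOLS.items()
        if name not in exclude
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


def run_agent(step_probes):
    """Retrieve once per reasoning step instead of once per task.

    step_probes stands in for the pseudo-tool descriptions an LLM would
    write at each step. The memory list keeps already-found tools out of
    later searches so the same tool is not retrieved twice.
    """
    memory = []
    for probe in step_probes:
        hits = retrieve(probe, k=1, exclude=memory)
        if not hits:
            continue
        # Assumed refinement: fold the top hit's documentation back into
        # the probe (pseudo-relevance feedback) and search again.
        refined = probe + " " + TOOLS[hits[0]]
        memory.extend(retrieve(refined, k=1, exclude=memory))
    return memory


if __name__ == "__main__":
    query = "what will the weather be in Paris tomorrow"
    print("static :", retrieve(query, k=1))
    print("dynamic:", run_agent([
        "resolve a city name to coordinates",
        "forecast the weather for given coordinates",
    ]))
```

In this toy setting the user query shares no vocabulary with the geocoding tool, so a single static lookup surfaces only `weather.forecast`. The step-level probes, phrased in documentation-like language, recover both tools in the order the task needs them.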