AI agent frameworks operate in isolation, forcing agents to rediscover solutions and repeat mistakes across systems. Frameworks such as smolagents, OpenHands, and OWL accumulate valuable problem-solving experience, but this knowledge stays trapped within each system, preventing collective intelligence from emerging. Current memory systems focus on individual agents or framework-specific demonstrations and do not support cross-architecture knowledge transfer. We introduce AGENT KB, a universal memory infrastructure that enables experience sharing across heterogeneous agent frameworks without retraining. AGENT KB aggregates trajectories into a structured knowledge base and serves them through lightweight APIs. At inference time, hybrid retrieval operates in two stages: the planning stage seeds agents with cross-domain workflows, while the feedback stage applies targeted diagnostic fixes. A disagreement gate ensures that retrieved knowledge enhances rather than disrupts reasoning, addressing knowledge interference in cross-framework transfer. We validate AGENT KB across major frameworks on GAIA, Humanity's Last Exam, GPQA, and SWE-bench. Results show substantial improvements across all base model families tested: relative to baseline pass@1, smolagents with AGENT KB gains up to 18.7pp at pass@3 (55.2% -> 73.9%), while OpenHands improves by 4.0pp on SWE-bench pass@1 (24.3% -> 28.3%). Ablations confirm that the hybrid retrieval and feedback stages are essential, and that automatically generated experiences match manual curation. These results lay a foundation for collective agent intelligence through shared memory infrastructure.
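The two-stage retrieval and the disagreement gate described in the abstract can be illustrated with a small sketch. The abstract does not specify how AGENT KB scores candidates or how the gate decides, so everything below is an assumption made for illustration: the `Experience` record, the `hybrid_score` mix of Jaccard overlap and bag-of-words cosine similarity (a stand-in for whatever lexical and semantic signals the real system combines), and a gate that accepts retrieved advice only when it is sufficiently similar to the agent's current plan. None of these names come from the AGENT KB API.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass
class Experience:
    """Hypothetical knowledge-base record distilled from an agent trajectory."""
    task: str    # short description of the task the trajectory addressed
    advice: str  # distilled workflow (planning) or diagnostic fix (feedback)
    stage: str   # "planning" or "feedback"


def _tokens(text):
    return text.lower().split()


def jaccard(a, b):
    """Lexical signal: overlap of the two token sets."""
    sa, sb = set(_tokens(a)), set(_tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Bag-of-words cosine similarity, standing in for an embedding model."""
    ca, cb = Counter(_tokens(a)), Counter(_tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def hybrid_score(query, exp, alpha=0.5):
    """Assumed hybrid score: weighted mix of the two signals above."""
    return alpha * jaccard(query, exp.task) + (1 - alpha) * cosine(query, exp.task)


def retrieve(kb, query, stage, k=1):
    """Return the top-k experiences for one stage ("planning" or "feedback")."""
    candidates = [e for e in kb if e.stage == stage]
    ranked = sorted(candidates, key=lambda e: hybrid_score(query, e), reverse=True)
    return ranked[:k]


def disagreement_gate(current_plan, exp, threshold=0.2):
    """Assumed gate: keep advice only if it is close enough to the current plan."""
    return cosine(current_plan, exp.advice) >= threshold
```

In this sketch an agent would call `retrieve(kb, task, "planning")` before acting and `retrieve(kb, error_summary, "feedback")` after a failed step, passing each hit through `disagreement_gate` before adding it to the prompt. The threshold and the weight `alpha` are arbitrary illustrative values.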