Agentic code generation requires large language models (LLMs) capable of complex context management and multi-step reasoning. Prior multi-agent frameworks attempt to address these challenges through collaboration, yet they often suffer from rigid workflows and high reasoning recovery costs. To overcome these limitations, we propose TALM (Tree-Structured Multi-Agent Framework with Long-Term Memory), a dynamic framework that integrates structured task decomposition, localized re-reasoning, and long-term memory mechanisms. TALM employs an extensible tree-based collaboration structure. Its parent-child relationships, combined with a divide-and-conquer strategy, enhance reasoning flexibility and enable efficient error correction across diverse task scopes. Furthermore, a long-term memory module supports semantic querying and integration of prior knowledge, enabling implicit self-improvement through experience reuse. Experimental results on the HumanEval, BigCodeBench, and ClassEval benchmarks demonstrate that TALM consistently delivers strong reasoning performance and high token efficiency, highlighting its robustness and practical utility in complex code generation tasks.