Agentic code generation requires large language models (LLMs) capable of complex context management and multi-step reasoning. Prior multi-agent frameworks attempt to address these challenges through collaboration, yet they often suffer from rigid workflows and high reasoning recovery costs. To overcome these limitations, we propose TALM (Tree-Structured Multi-Agent Framework with Long-Term Memory), a dynamic framework that integrates structured task decomposition, localized re-reasoning, and long-term memory mechanisms. TALM employs an extensible tree-based collaboration structure; its parent-child relationships, combined with a divide-and-conquer strategy, enhance reasoning flexibility and enable efficient error correction across diverse task scopes. Furthermore, a long-term memory module enables semantic querying and integration of prior knowledge, supporting implicit self-improvement through experience reuse. Experimental results on the HumanEval, BigCodeBench, and ClassEval benchmarks demonstrate that TALM consistently delivers strong reasoning performance and high token efficiency, highlighting its robustness and practical utility in complex code generation tasks.
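To make the three mechanisms in the abstract concrete, the minimal Python sketch below mocks them up under stated assumptions: `TaskNode`, `LongTermMemory`, `decompose`, `solve`, and `resolve_subtree` are hypothetical names, the leaf-level "LLM call" is a placeholder string, and keyword-overlap retrieval stands in for semantic querying. It illustrates the ideas (tree-shaped parent-child decomposition, subtree-local re-reasoning, experience reuse), not TALM's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryEntry:
    """A stored (task, solution) pair available for experience reuse."""
    task: str
    solution: str


class LongTermMemory:
    """Toy long-term memory: keyword overlap stands in for the semantic
    querying described in the abstract; a real system would use
    embedding-based similarity search."""

    def __init__(self) -> None:
        self.entries: List[MemoryEntry] = []

    def store(self, task: str, solution: str) -> None:
        self.entries.append(MemoryEntry(task, solution))

    def query(self, task: str) -> Optional[MemoryEntry]:
        words = set(task.lower().split())
        best, best_score = None, 0
        for entry in self.entries:
            score = len(words & set(entry.task.lower().split()))
            if score > best_score:
                best, best_score = entry, score
        return best


@dataclass
class TaskNode:
    """One agent in the collaboration tree; children hold decomposed subtasks."""
    task: str
    parent: Optional["TaskNode"] = None
    children: List["TaskNode"] = field(default_factory=list)
    result: Optional[str] = None

    def decompose(self, subtasks: List[str]) -> List["TaskNode"]:
        """Structured task decomposition: split this task into child nodes."""
        self.children = [TaskNode(task=t, parent=self) for t in subtasks]
        return self.children

    def solve(self, memory: LongTermMemory) -> str:
        """Divide and conquer with experience reuse from long-term memory."""
        hit = memory.query(self.task)
        if hit is not None:
            self.result = hit.solution  # reuse prior knowledge
        elif self.children:
            self.result = "\n".join(c.solve(memory) for c in self.children)
        else:
            # Placeholder for an LLM call that generates code for a leaf task.
            self.result = f"# code for: {self.task}"
        return self.result

    def resolve_subtree(self, memory: LongTermMemory) -> str:
        """Localized re-reasoning: recompute only this node's subtree,
        leaving results elsewhere in the tree untouched."""
        for child in self.children:
            child.result = None
        self.result = None
        return self.solve(memory)


if __name__ == "__main__":
    memory = LongTermMemory()
    root = TaskNode(task="implement a CSV statistics tool")
    _, stats = root.decompose(["parse CSV rows", "compute column statistics"])
    print(root.solve(memory))
    # If verification finds an error in one subtask, only that subtree is redone.
    print(stats.resolve_subtree(memory))
    # Accepted solutions are written back for later semantic reuse.
    memory.store(root.task, root.result)
```

Confining re-reasoning to a single subtree is what keeps recovery cheap in this sketch: siblings and ancestors retain their results, so only the failed scope is recomputed rather than the whole reasoning chain.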