Transformer-based large language models (LLMs) are constrained by the fixed context window of the underlying transformer architecture, hindering their ability to produce long and logically consistent code. Memory-augmented LLMs are a promising solution, but current approaches cannot handle long code generation tasks since they (1) focus only on reading memory and reduce its evolution to the concatenation of new memories or (2) use highly specialized memories that cannot adapt to other domains. This paper presents L2MAC, the first practical LLM-based stored-program automatic computer for long and consistent code generation. Its memory has two components: the instruction registry, which is populated with a prompt program to solve the user-given task, and a file store, which will contain the final and intermediate outputs. Each instruction is executed by a separate LLM instance, whose context is managed by a control unit capable of precise memory reading and writing to ensure effective interaction with the file store. These components enable L2MAC to generate virtually unbounded code structures, bypassing the constraints of the finite context window while producing code that fulfills complex user-specified requirements. We empirically show that L2MAC succeeds in generating large code bases for system design tasks where other coding methods fall short in implementing user requirements, and we provide insight into the reasons for this performance gap.
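The division of labor described in the abstract (an instruction registry, a file store, and a control unit that decides what each LLM call reads and writes) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `StoredProgramSketch`, the character budget `max_context_chars`, and the stub `fake_llm` are all hypothetical names introduced here, and the "LLM" is assumed to be any callable that maps a prompt string to a dictionary of file updates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface: an "LLM" is any callable that takes a prompt and
# returns {file_path: new_contents}. A real system would call a model here.
LLMFn = Callable[[str], Dict[str, str]]


@dataclass
class StoredProgramSketch:
    """Toy control loop: one fresh LLM call per instruction, shared file store."""

    instructions: List[str]                              # instruction registry
    files: Dict[str, str] = field(default_factory=dict)  # file store
    max_context_chars: int = 2000  # assumed stand-in for the context window

    def build_context(self, instruction: str) -> str:
        """Read step: the instruction plus every stored file that still fits."""
        parts = [f"INSTRUCTION: {instruction}"]
        used = len(parts[0])
        for path, body in self.files.items():
            entry = f"\n--- {path} ---\n{body}"
            if used + len(entry) > self.max_context_chars:
                continue  # skip files that would overflow the budget
            parts.append(entry)
            used += len(entry)
        return "".join(parts)

    def run(self, llm: LLMFn) -> Dict[str, str]:
        """Execute each instruction in order and write results to the file store."""
        for instruction in self.instructions:
            context = self.build_context(instruction)  # bounded context per call
            updates = llm(context)                     # separate call per instruction
            self.files.update(updates)                 # write step
        return self.files


def fake_llm(prompt: str) -> Dict[str, str]:
    """Stub model: emits one file named after the instruction it was given."""
    name = prompt.splitlines()[0].split(": ", 1)[1].strip()
    return {f"{name}.py": f"# generated for: {name}\n"}


if __name__ == "__main__":
    machine = StoredProgramSketch(instructions=["models", "views"])
    for path in machine.run(fake_llm):
        print(path)
```

The point of the sketch is that no single call ever sees more than `max_context_chars` of state, while the file store can grow without bound across instructions. The control unit described in the paper performs more precise reading and writing than this greedy "include what fits" policy.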