Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks. However, their performance depends on massive amounts of data and computation. We argue that this inefficiency stems from the forgetting phenomenon: a model stores its behaviors implicitly in its parameters throughout training, so training on a new task may degrade its performance on previous tasks. In contrast to LLMs' implicit memory mechanism, the human brain uses distributed memory storage, which helps manage and organize multiple skills efficiently and mitigates forgetting. Inspired by this, we propose a working memory module that stores, blends, and retrieves information for different downstream tasks. Evaluation results show that the proposed method improves training efficiency and generalization on Atari games and Meta-World object manipulation tasks. Moreover, we demonstrate that memory fine-tuning further enhances the adaptability of the proposed architecture.
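To make the store, blend, and retrieve operations concrete, the following is a minimal sketch of a slot-based memory. It is an illustration only, not the module proposed in the paper: the class `WorkingMemory`, its methods, the `rate` parameter, and the dot-product attention read are all assumptions chosen for brevity.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def _softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class WorkingMemory:
    """Toy slot memory (hypothetical; not the paper's architecture)."""

    def __init__(self, num_slots, dim):
        self.dim = dim
        self.slots = [[0.0] * dim for _ in range(num_slots)]
        self.used = [False] * num_slots

    def store(self, vec):
        """Write vec into the first free slot; return its index, or None if full."""
        for i, taken in enumerate(self.used):
            if not taken:
                self.slots[i] = list(vec)
                self.used[i] = True
                return i
        return None

    def blend(self, index, vec, rate=0.5):
        """Interpolate an existing slot toward vec instead of overwriting it."""
        old = self.slots[index]
        self.slots[index] = [(1 - rate) * o + rate * v for o, v in zip(old, vec)]

    def retrieve(self, query):
        """Attention-weighted read over the used slots; zeros if memory is empty."""
        idx = [i for i, u in enumerate(self.used) if u]
        if not idx:
            return [0.0] * self.dim
        weights = _softmax([_dot(query, self.slots[i]) for i in idx])
        return [
            sum(w * self.slots[i][d] for w, i in zip(weights, idx))
            for d in range(self.dim)
        ]
```

In this sketch, blending updates a slot by interpolation rather than replacement, so information written for an earlier task is attenuated rather than erased, and retrieval returns a similarity-weighted mixture of slots rather than a single entry.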