The increasing complexity of tasks in robotics demands efficient strategies for multitask and continual learning. Traditional models typically rely on a single universal policy for all tasks, facing challenges such as high computational costs and catastrophic forgetting when learning new tasks. To address these issues, we introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). By adopting Mixture of Experts (MoE) within a transformer-based diffusion policy, SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model. SDP not only reduces the number of active parameters but also facilitates the seamless integration and reuse of experts across various tasks. Extensive experiments on diverse tasks in both simulation and the real world show that SDP 1) excels in multitask scenarios with a negligible increase in active parameters, 2) prevents forgetting during continual learning of new tasks, and 3) enables efficient task transfer, offering a promising solution for advanced robotic applications. Demos and code are available at https://forrest-110.github.io/sparse_diffusion_policy/.
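The sparse expert routing that underpins SDP can be illustrated with a minimal sketch, assuming a standard top-k MoE layer (all class and variable names here are illustrative, not the paper's implementation): a learned router scores the experts for each input, only the top-k experts are activated, and their outputs are combined with softmax weights. This is why adding experts for new tasks leaves the number of *active* parameters per forward pass nearly constant.

```python
import numpy as np

rng = np.random.default_rng(0)

class SparseMoELayer:
    """Minimal top-k Mixture-of-Experts layer (illustrative sketch only).

    A router scores every expert for each input; only the top-k experts
    run, so compute and active parameters stay small as experts are added.
    """

    def __init__(self, dim, num_experts, top_k=2):
        self.top_k = top_k
        # Router: maps an input to one logit per expert.
        self.router = rng.standard_normal((dim, num_experts)) * 0.02
        # Each expert here is a simple linear map for clarity.
        self.experts = [rng.standard_normal((dim, dim)) * 0.02
                        for _ in range(num_experts)]

    def __call__(self, x):
        # x: (dim,) — a single token, to keep the routing logic readable.
        logits = x @ self.router                    # (num_experts,)
        top = np.argsort(logits)[-self.top_k:]      # indices of top-k experts
        w = np.exp(logits[top] - logits[top].max())
        w /= w.sum()                                # softmax over selected experts only
        # Weighted sum of the activated experts' outputs; the others never run.
        return sum(wi * (x @ self.experts[i]) for wi, i in zip(w, top))

layer = SparseMoELayer(dim=8, num_experts=4, top_k=2)
y = layer(rng.standard_normal(8))
print(y.shape)  # (8,)
```

In a transformer-based diffusion policy, such a layer would replace the dense feed-forward block, and continual learning can add new experts while freezing existing ones, which is one way the abstract's resistance to forgetting can be realized.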