Diffusion-based policies have recently shown strong results in robot manipulation, but their extension to multi-task scenarios is hindered by the high cost of scaling model size and demonstrations. We introduce Skill Mixture-of-Experts Policy (SMP), a diffusion-based mixture-of-experts policy that learns a compact orthogonal skill basis and uses sticky routing to compose actions from a small, task-relevant subset of experts at each step. A variational training objective supports this design, and adaptive expert activation at inference yields fast sampling without oversized backbones. We validate SMP in simulation and on a real dual-arm platform with multi-task learning and transfer learning tasks, where SMP achieves higher success rates and markedly lower inference cost than large diffusion baselines. These results indicate a practical path toward scalable, transferable multi-task manipulation: learn reusable skills once, activate only what is needed, and adapt quickly when tasks change.
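The sticky-routing idea — reusing the previous step's expert subset and switching only when the gate strongly prefers a different expert — can be illustrated with a toy sketch. Everything here (expert count, the `switch_margin` rule, linear maps standing in for skill experts) is a hypothetical stand-in for illustration, not the paper's actual SMP implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, DIM = 8, 2, 4  # illustrative sizes, not from the paper

def top_k(scores, k):
    """Indices of the k highest-scoring experts."""
    return set(np.argsort(scores)[-k:])

def sticky_route(scores, prev_active, switch_margin=0.1):
    """Keep the previous expert subset unless a newly preferred expert
    beats a currently active one by more than `switch_margin`
    (a hypothetical hysteresis rule, used here only to convey stickiness)."""
    if prev_active is None:
        return top_k(scores, TOP_K)
    best_new = top_k(scores, TOP_K)
    if best_new == prev_active:
        return prev_active
    worst_active = min(scores[i] for i in prev_active)
    challengers = [i for i in best_new - prev_active
                   if scores[i] > worst_active + switch_margin]
    return best_new if challengers else prev_active

# Toy experts: each linear map stands in for one skill-basis component.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]

def compose_action(x, active, scores):
    """Weighted sum of the active experts' outputs
    (softmax over the active experts' gate scores)."""
    idx = sorted(active)
    w = np.exp(np.array([scores[i] for i in idx]))
    w /= w.sum()
    return sum(wi * experts[i] @ x for wi, i in zip(w, idx))

active = None
x = rng.standard_normal(DIM)
for step in range(5):  # a few denoising steps
    # Fake gate scores: a stable preference plus small per-step noise.
    scores = np.linspace(0.0, 1.0, NUM_EXPERTS) + 0.05 * rng.standard_normal(NUM_EXPERTS)
    active = sticky_route(scores, active)
    x = compose_action(x, active, scores)
```

Because only `TOP_K` of the `NUM_EXPERTS` experts run per step and the subset rarely changes, each denoising step touches a small, mostly fixed slice of the model — the source of the inference savings the abstract describes.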