As reinforcement learning for humanoid robots moves from single-task to multi-skill paradigms, adding new skills efficiently while avoiding catastrophic forgetting has become a key challenge in embodied intelligence. Existing approaches either rely on complex topology adjustments in Mixture-of-Experts (MoE) models or require training extremely large models, which makes lightweight deployment difficult. To address this, we propose Tree Learning, a multi-skill continual learning framework for humanoid robots. The framework adopts a root-branch hierarchical parameter inheritance mechanism that provides motion priors for branch skills through parameter reuse and fundamentally prevents catastrophic forgetting. A multi-modal feedforward adaptation mechanism that combines phase modulation and interpolation supports both periodic and aperiodic motions. We also propose a task-level reward shaping strategy to accelerate skill convergence. Unity-based simulation experiments show that, compared with simultaneous multi-task training, Tree Learning achieves higher rewards across a range of representative locomotion skills while maintaining a 100% skill retention rate, enabling seamless multi-skill switching and real-time interactive control. We further validate the performance and generalization capability of Tree Learning on two distinct Unity-simulated tasks: a Super Mario-inspired interactive scenario and autonomous navigation in a classical Chinese garden environment.
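The root-branch parameter inheritance idea can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not say whether branches copy or share root parameters, so the sketch assumes a branch starts from a copy of its parent's parameters and the parent is frozen afterwards. The `SkillNode` class, its methods, and the list-of-floats "policy" are all hypothetical names introduced for illustration.

```python
import copy


class SkillNode:
    """Hypothetical node in a skill tree; `params` stands in for policy weights."""

    def __init__(self, name, params, parent=None):
        self.name = name
        self.params = params
        self.parent = parent
        self.frozen = False

    def branch(self, name):
        """Create a child skill initialised from a copy of this node's parameters.

        Assumption: the parent is frozen once it has a child, so training the
        child can never overwrite the parent skill.
        """
        self.frozen = True
        return SkillNode(name, copy.deepcopy(self.params), parent=self)

    def update(self, grads, lr=0.1):
        """Apply one plain gradient step; refuse to modify a frozen node."""
        if self.frozen:
            raise RuntimeError(f"skill '{self.name}' is frozen")
        self.params = [p - lr * g for p, g in zip(self.params, grads)]


# Usage: the root skill provides the prior, the branch is trained separately.
root = SkillNode("walk", [0.5, -0.2])
run = root.branch("run")
run.update([1.0, 1.0])
```

Under these assumptions the root parameters are untouched by any branch update, which is one simple way a tree-structured scheme could retain earlier skills while new ones are added.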