Recent work has demonstrated strong performance on individual humanoid skills such as upright locomotion, fall recovery, and whole-body coordination. Learning a single policy that masters all of these skills remains challenging, however, because the skills involve diverse dynamics and conflicting control objectives. To address this, we introduce X-Loco, a framework for training a vision-based generalist humanoid locomotion policy. X-Loco first trains multiple oracle specialist policies and then applies synergetic policy distillation with a case-adaptive specialist selection mechanism, which dynamically draws on the specialists to guide a vision-based student policy. This design enables the student to acquire a broad spectrum of locomotion skills, ranging from fall recovery to terrain traversal and whole-body coordination. To the best of our knowledge, X-Loco is the first framework to demonstrate vision-based humanoid locomotion that jointly integrates upright locomotion, whole-body coordination, and fall recovery while operating solely under velocity commands, without relying on reference motions. Experimental results show that X-Loco achieves superior performance on tasks such as fall recovery and terrain traversal. Ablation studies further show that the framework effectively leverages specialist expertise and improves learning efficiency.
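The distillation scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the case-adaptive selection is made or which loss is used, so everything here is an assumption for illustration only: the specialist names, the height-based selection rule, the `fallen_height` threshold, and the mean-squared-error loss are all hypothetical and are not X-Loco's actual implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Action = List[float]
Policy = Callable[[Sequence[float]], Action]


def select_specialist(base_height: float,
                      needs_whole_body: bool,
                      fallen_height: float = 0.4) -> str:
    """Hypothetical case rule: choose which specialist supervises a sample.

    The threshold and the two input signals are assumptions, not taken
    from the paper.
    """
    if base_height < fallen_height:
        return "fall_recovery"
    if needs_whole_body:
        return "whole_body"
    return "upright"


def distillation_loss(student_action: Sequence[float],
                      teacher_action: Sequence[float]) -> float:
    """Mean squared error between student and specialist actions (assumed loss)."""
    if len(student_action) != len(teacher_action):
        raise ValueError("action dimensions must match")
    squared = [(s - t) ** 2 for s, t in zip(student_action, teacher_action)]
    return sum(squared) / len(squared)


def supervised_target(specialists: Dict[str, Policy],
                      privileged_obs: Sequence[float],
                      base_height: float,
                      needs_whole_body: bool) -> Tuple[str, Action]:
    """Query only the specialist selected for the current case."""
    name = select_specialist(base_height, needs_whole_body)
    return name, specialists[name](privileged_obs)


# Toy specialists that return constant actions, used only to exercise the sketch.
toy_specialists: Dict[str, Policy] = {
    "fall_recovery": lambda obs: [1.0, 1.0],
    "whole_body": lambda obs: [0.5, 0.5],
    "upright": lambda obs: [0.0, 0.0],
}

chosen, target = supervised_target(toy_specialists, [0.0], base_height=0.2,
                                   needs_whole_body=False)
loss = distillation_loss([0.0, 1.0], target)
```

In a real training loop the student would receive visual observations while the specialists read privileged state, and the loss would be minimized by gradient descent over many sampled cases. The sketch only shows how a per-sample specialist choice determines the supervision target.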