For a humanoid robot to be deployed in the real world, the choice of command space, i.e., the interface between task planning and whole-body control, is crucial. Existing whole-body controllers typically demand dense kinematic or spatial references that planners struggle to synthesize from task semantics. We instead propose a compact, explicit interface that is intuitive, general, modular, and expressive enough for diverse manipulation skills. To this end, we introduce HANDOFF, a single humanoid whole-body controller that follows this interface. HANDOFF is a mixture-of-experts student distilled from three complementary specialists (whole-body motion tracking with safety-filtered data, locomotion, and fall recovery) via multi-teacher KL distillation under a context-conditioned gating scheme. On the Unitree G1, HANDOFF matches state-of-the-art velocity tracking and offers one of the largest robust manipulation workspaces. We further demonstrate hardware feasibility through multiple natural-language-driven task rollouts, powered by a VLM-driven agentic planner with no task-specific data or controller fine-tuning.
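To make the distillation objective concrete, the following is a minimal sketch of multi-teacher KL distillation into a mixture-of-experts student. It is not the authors' implementation. It assumes diagonal-Gaussian action distributions, that the context label simply selects which specialist supervises a given sample, and that the student's output mean is a gate-weighted average of expert means. The context labels and all function names (`gaussian_kl`, `student_action_mean`, `distillation_loss`) are hypothetical.

```python
import math

# Hypothetical context labels, one per specialist named in the abstract.
TEACHERS = ("tracking", "locomotion", "fall_recovery")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL(p || q) between diagonal Gaussians, summed over action dimensions."""
    kl = 0.0
    for mp, sp, mq, sq in zip(mu_p, std_p, mu_q, std_q):
        kl += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return kl


def student_action_mean(gate_logits, expert_means):
    """Mixture-of-experts output: gate-weighted average of expert action means."""
    weights = softmax(gate_logits)
    dim = len(expert_means[0])
    return [sum(w * mu[d] for w, mu in zip(weights, expert_means)) for d in range(dim)]


def distillation_loss(context, teacher_outputs, gate_logits, expert_means, student_std):
    """KL from the teacher selected by `context` to the student policy.

    Assumption: the context picks exactly one supervising teacher (hard selection).
    """
    mu_t, std_t = teacher_outputs[context]
    mu_s = student_action_mean(gate_logits, expert_means)
    return gaussian_kl(mu_t, std_t, mu_s, student_std)


# Toy example: two experts, two action dimensions, unit standard deviations.
teacher_outputs = {
    "tracking": ([2.0, 1.0], [1.0, 1.0]),
    "locomotion": ([0.0, 0.0], [1.0, 1.0]),
    "fall_recovery": ([-1.0, 0.5], [1.0, 1.0]),
}
expert_means = [[1.0, 0.0], [3.0, 2.0]]
gate_logits = [0.0, 0.0]  # uniform gate, so the student mean is [2.0, 1.0]

loss_tracking = distillation_loss("tracking", teacher_outputs, gate_logits, expert_means, [1.0, 1.0])
loss_locomotion = distillation_loss("locomotion", teacher_outputs, gate_logits, expert_means, [1.0, 1.0])
```

In this toy setup the student already matches the tracking teacher, so its loss is zero, while the locomotion context yields a positive loss that training would reduce by adjusting the gate and experts. A soft, learned weighting over teachers would be an equally plausible reading of "context-conditioned gating"; the hard selection here is chosen only for simplicity.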