Make Tracking Easy: Neural Motion Retargeting for Humanoid Whole-body Control

Humanoid robots require diverse motor skills to integrate into complex environments, but bridging the kinematic and dynamic embodiment gap between human data and robot hardware remains a major bottleneck. We demonstrate through Hessian analysis that traditional optimization-based retargeting is inherently non-convex and prone to local optima, leading to physical artifacts such as joint jumps and self-penetration. To address this, we reformulate the retargeting problem as learning a data distribution rather than searching for an optimal solution, and propose NMR, a Neural Motion Retargeting framework that transforms static geometric mapping into a dynamics-aware learned process. We first propose Clustered-Expert Physics Refinement (CEPR), a hierarchical data pipeline that leverages VAE-based motion clustering to group heterogeneous movements into latent motifs. This strategy significantly reduces the computational overhead of massively parallel reinforcement learning experts, which project noisy human demonstrations onto the robot's feasible motion manifold and repair them in the process. The resulting high-fidelity data supervises a non-autoregressive CNN-Transformer architecture that reasons over global temporal context to suppress reconstruction noise and bypass geometric traps. Experiments on the Unitree G1 humanoid across diverse dynamic tasks (e.g., martial arts, dancing) show that NMR eliminates joint jumps and significantly reduces self-collisions compared to state-of-the-art baselines. Furthermore, NMR-generated references accelerate the convergence of downstream whole-body control policies, establishing a scalable path for bridging the human-robot embodiment gap.

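The non-convexity claim in the abstract can be illustrated with a toy problem. The sketch below is not the paper's Hessian analysis: it assumes a hypothetical planar two-link arm with unit link lengths and a position-matching cost, and every function name is invented for illustration. It shows two distinct joint configurations that both reach the target exactly, with a strictly worse cost between them, and a pose where the finite-difference Hessian of the cost has a negative eigenvalue.

```python
import numpy as np


def forward_kinematics(q, link_lengths=(1.0, 1.0)):
    """End-effector position of a planar two-link arm (toy stand-in for a robot limb)."""
    l1, l2 = link_lengths
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])


def retarget_cost(q, target):
    """Half squared distance between the end effector and the target keypoint."""
    err = forward_kinematics(q) - target
    return 0.5 * float(err @ err)


def numerical_hessian(f, q, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at q."""
    q = np.asarray(q, dtype=float)
    n = q.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (
                f(q + ei + ej) - f(q + ei - ej) - f(q - ei + ej) + f(q - ei - ej)
            ) / (4.0 * h * h)
    return H


# Assumed target keypoint; both configurations below reach it exactly.
target = np.array([1.0, 1.0])


def cost(q):
    return retarget_cost(q, target)


elbow_a = np.array([0.0, np.pi / 2])
elbow_b = np.array([np.pi / 2, -np.pi / 2])
midpoint = 0.5 * (elbow_a + elbow_b)

# Arm fully extended and pointing away from the target.
far_pose = np.array([5 * np.pi / 4, 0.0])
min_eig = np.linalg.eigvalsh(numerical_hessian(cost, far_pose)).min()

print("cost at elbow_a:", cost(elbow_a))
print("cost at elbow_b:", cost(elbow_b))
print("cost at midpoint:", cost(midpoint))
print("smallest Hessian eigenvalue at far_pose:", min_eig)
```

A convex cost could not have two isolated minimizers with a higher value between them, nor a Hessian with a negative eigenvalue. Even this two-joint toy therefore exhibits the kind of landscape in which a local optimizer can switch between solutions from frame to frame, which is one plausible reading of the joint jumps the abstract describes.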