TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning

Recent Vision-Language-Action (VLA) models show potential to generalize across embodiments but struggle to quickly align with a new robot's action space when high-quality demonstrations are scarce, especially for bipedal humanoids. We present TrajBooster, a cross-embodiment framework that leverages abundant wheeled-humanoid data to boost bipedal VLA performance. Our key idea is to use end-effector trajectories as a morphology-agnostic interface. TrajBooster (i) extracts 6D dual-arm end-effector trajectories from real-world wheeled humanoids, (ii) retargets them in simulation to Unitree G1 with a whole-body controller trained via a heuristic-enhanced harmonized online DAgger to lift low-dimensional trajectory references into feasible high-dimensional whole-body actions, and (iii) forms heterogeneous triplets that couple source vision/language with target humanoid-compatible actions to post-pre-train a VLA, followed by only 10 minutes of teleoperation data collection on the target humanoid domain. Deployed on Unitree G1, our policy accomplishes beyond-tabletop household tasks, enabling squatting, cross-height manipulation, and coordinated whole-body motion with markedly improved robustness and generalization. Results show that TrajBooster allows existing wheeled-humanoid data to efficiently strengthen bipedal humanoid VLA performance, reducing reliance on costly same-embodiment data while enhancing action space understanding and zero-shot skill transfer capabilities. For more details, please refer to \href{https://jiachengliu3.github.io/TrajBooster/}{https://jiachengliu3.github.io/TrajBooster/}.

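The heterogeneous triplets described in step (iii) can be pictured with a minimal sketch. It assumes each source frame carries a camera-frame reference, a language instruction, and a 12-value dual-arm end-effector target (two arms, 6D each); the class names, field names, and the `placeholder_retarget` function are hypothetical illustrations, not the authors' implementation. In the framework itself, the retargeting step is performed by the learned whole-body controller in simulation, for which the identity function below is only a stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

EE_DIM = 12  # assumption: 2 arms x 6D end-effector pose


@dataclass
class SourceFrame:
    """One step of wheeled-humanoid data (hypothetical layout)."""
    image_id: str              # reference to a source-robot camera frame
    instruction: str           # language instruction for the episode
    ee_pose: Sequence[float]   # dual-arm end-effector target, EE_DIM values


@dataclass
class Triplet:
    """Source vision and language paired with a target-robot action."""
    image_id: str
    instruction: str
    target_action: List[float]


def build_triplets(
    frames: Sequence[SourceFrame],
    retarget: Callable[[Sequence[float]], Sequence[float]],
) -> List[Triplet]:
    """Pair each source observation with a retargeted target-robot action."""
    triplets = []
    for frame in frames:
        if len(frame.ee_pose) != EE_DIM:
            raise ValueError(f"expected {EE_DIM} end-effector values")
        action = list(retarget(frame.ee_pose))
        triplets.append(Triplet(frame.image_id, frame.instruction, action))
    return triplets


def placeholder_retarget(ee_pose: Sequence[float]) -> List[float]:
    """Stand-in for the whole-body controller: returns the input unchanged."""
    return list(ee_pose)


if __name__ == "__main__":
    demo = [SourceFrame("frame_000", "pick up the cup", [0.0] * EE_DIM)]
    for t in build_triplets(demo, placeholder_retarget):
        print(t.image_id, t.instruction, len(t.target_action))
```

The point of the sketch is the pairing: vision and language stay in the source domain, and only the action component is replaced by one compatible with the target humanoid.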