The ability to push large objects in a goal-directed manner using onboard egocentric perception is an essential skill for humanoid robots performing complex tasks such as material handling in warehouses. To robustly manipulate heavy objects to arbitrary goal configurations, the robot must cope with unknown object mass and ground friction, noisy onboard perception, and actuation errors, all within a real-time feedback loop. Existing solutions either rely on privileged object-state information without onboard perception or lack robustness to variations in goal configurations and object physical properties. In this work, we present VOFA, a visual goal-conditioned humanoid loco-manipulation system capable of pushing objects with unknown physical properties to arbitrary goal positions. VOFA consists of a two-level hierarchical architecture with a high-level visuomotor policy and a low-level force-adaptive whole-body controller. The high-level policy processes noisy onboard observations and generates goal-conditioned commands, operating in closed loop across diverse object-goal configurations, while the low-level whole-body controller provides robustness to variations in object physical properties. VOFA is extensively evaluated in both simulation and real-world experiments on the Booster T1 humanoid robot. It achieves over 90% success in simulation and over 80% success in real-world trials. Moreover, VOFA successfully pushes objects weighing up to 17 kg, exceeding half of the Booster T1's body weight.