This paper presents Adaptive Whole-body Loco-Manipulation (AdaptManip), a fully autonomous framework for humanoid robots to perform integrated navigation, object lifting, and delivery. Unlike prior imitation learning-based approaches, which rely on human demonstrations and are often brittle to disturbances, AdaptManip trains a robust loco-manipulation policy via reinforcement learning, without human demonstrations or teleoperation data. The proposed framework consists of three coupled components: (1) a recurrent object state estimator that tracks the manipulated object in real time under a limited field of view and occlusions; (2) a whole-body base policy for robust locomotion, augmented with residual manipulation control for stable object lifting and delivery; and (3) a LiDAR-based global position estimator that provides drift-robust localization. All components are trained in simulation using reinforcement learning and deployed on real hardware in a zero-shot manner. Experimental results show that AdaptManip significantly outperforms baseline methods, including imitation learning-based approaches, in both adaptability and overall success rate, and that accurate object state estimation improves manipulation performance even under occlusion. We further demonstrate fully autonomous real-world navigation, object lifting, and delivery on a humanoid robot.