Reinforcement learning (RL) controllers have made impressive progress in humanoid locomotion and lightweight object manipulation. However, achieving robust and precise motion control under intense force interaction remains a significant challenge. To address these limitations, this paper proposes HAFO, a dual-agent reinforcement learning framework that concurrently optimizes a robust locomotion policy and a precise upper-body manipulation policy via coupled training. We employ a constrained residual action space to improve dual-agent training stability and sample efficiency. External tension disturbances are explicitly modeled with a spring-damper system, enabling fine-grained force control by manipulating the virtual spring. In this process, the reinforcement learning policy autonomously generates disturbance-rejection responses from environmental feedback. Experimental results demonstrate that HAFO achieves whole-body control of humanoid robots across diverse force-interaction environments with a single dual-agent policy, delivering outstanding performance under load-bearing and thrust-disturbance conditions while maintaining stable operation even in a rope-suspension state.
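The spring-damper disturbance model mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the gain values `k`, `c`, and the rest length are hypothetical placeholders, and the force is applied along the line from a virtual anchor to the robot's end-effector.

```python
import numpy as np

def spring_damper_force(anchor, pos, vel, k=200.0, c=10.0, rest_len=0.0):
    """Tension from a virtual spring-damper attached between `anchor`
    and the end-effector at `pos` (a sketch; k, c, rest_len are
    illustrative values, not the paper's parameters)."""
    delta = pos - anchor
    dist = np.linalg.norm(delta)
    if dist < 1e-9:
        # Degenerate case: no well-defined spring axis
        return np.zeros(3)
    direction = delta / dist
    # Hooke's-law term for stretch beyond the rest length,
    # plus damping along the spring axis
    stretch = dist - rest_len
    radial_vel = float(np.dot(vel, direction))
    return -(k * stretch + c * radial_vel) * direction
```

Varying `k` and `rest_len` over training episodes would then expose the policy to a spectrum of tension disturbances, letting it learn disturbance rejection from environmental feedback as the abstract describes.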