Object manipulation with quadruped robots remains an open research challenge. While previous studies have focused on low-level policy learning, task execution still relies on expert-designed high-level trajectories. Autonomously selecting both an affordable interaction point on the target object and an affordable robot base pose removes the need for such pre-designed trajectories. This study proposes a three-level hierarchical reinforcement learning (RL) framework in which pose affordances guide the navigation policy, and the navigation policy in turn drives the locomotion policy. In addition, interaction-point affordances guide the pedipulation policy, enabling object-centric pose alignment of the quadruped robot and effective end-effector manipulation planning. We train the proposed framework in the IsaacSim ecosystem and evaluate it in both simulation and real-world settings. In simulation, we investigate the effectiveness of pose affordance across multiple scenarios; in the real world, we validate various object interaction tasks, which together form an object-interaction dataset. The results show that the proposed framework can autonomously identify candidate poses based on their affordance and successfully execute object manipulation tasks in the real world without human guidance.