Although end-to-end robot learning has shown some success for robot manipulation, the learned policies are often not sufficiently robust to variations in object pose or geometry. To improve policy generalization, we introduce spatially-grounded parameterized motion primitives in our method HACMan++. Specifically, we propose an action representation consisting of three components: *what* primitive type (such as grasp or push) to execute, *where* the primitive will be grounded (e.g., where the gripper will make contact with the world), and *how* the primitive motion is executed, such as parameters specifying the push direction or grasp orientation. These three components define a novel discrete-continuous action space for reinforcement learning. Our framework enables robot agents to learn to chain diverse motion primitives together and to select appropriate primitive parameters to complete long-horizon manipulation tasks. By grounding the primitives on a spatial location in the environment, our method is able to generalize effectively across variations in object shape and pose. Our approach significantly outperforms existing methods, particularly in complex scenarios demanding both high-level sequential reasoning and object generalization. With zero-shot sim-to-real transfer, our policy succeeds in challenging real-world manipulation tasks and generalizes to unseen objects. Videos can be found on the project website: https://sgmp-rss2024.github.io.
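The discrete-continuous action described above can be sketched as a simple data structure; this is a minimal illustrative sketch, not the paper's actual interface, and all names, primitive types, and array shapes here are assumptions.

```python
# Hypothetical sketch of a spatially-grounded parameterized primitive action:
# a discrete primitive type ("what"), a 3D grounding location ("where"),
# and continuous motion parameters ("how"). Illustrative only.
from dataclasses import dataclass
import numpy as np

PRIMITIVES = ["grasp", "push", "place"]  # assumed discrete primitive set

@dataclass
class PrimitiveAction:
    primitive: int        # index into PRIMITIVES ("what")
    location: np.ndarray  # 3D point where the primitive is grounded ("where")
    params: np.ndarray    # continuous parameters, e.g. push direction ("how")

    def describe(self) -> str:
        # Human-readable summary of the chosen primitive and its grounding.
        return f"{PRIMITIVES[self.primitive]} at {self.location.round(2).tolist()}"

# Example: a push grounded at a contact point on an object, directed along +x.
action = PrimitiveAction(primitive=1,
                         location=np.array([0.4, 0.1, 0.02]),
                         params=np.array([1.0, 0.0, 0.0]))
print(action.describe())
```

A policy over this space would output one discrete choice (the primitive index) jointly with continuous quantities (the grounding location and motion parameters), which is what makes the action space discrete-continuous.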