Recent work on robotic manipulation using reinforcement learning (RL) or imitation learning (IL) has shown potential for tackling a range of tasks, e.g., opening a drawer or a cupboard. However, these techniques generalize poorly to unseen objects. We conjecture that this is due to the high-dimensional action space of joint-level control. In this paper, we take an alternative approach and separate the task of learning 'what to do' from 'how to do it', i.e., whole-body control. We pose the RL problem as one of determining the skill dynamics for a disembodied virtual manipulator interacting with articulated objects. A whole-body kinematic controller then optimizes the robot's high-dimensional joint motion to reach the resulting goals in the workspace. It does so by solving a quadratic programming (QP) problem subject to singularity and kinematic constraints. Our experiments on manipulating complex articulated objects show that the proposed approach generalizes better than previous approaches to unseen objects with large intra-class variations. The evaluation results also indicate that it generates more compliant robot motion and achieves higher task success rates than pure RL and IL baselines. Additional information and videos are available at https://kl-research.github.io/decoupskill
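For readers unfamiliar with QP-based kinematic control, the following is a generic sketch of how a workspace goal can be turned into joint motion. It is not the formulation used in this work, which the abstract does not specify. All symbols are assumptions introduced for illustration: q is the joint configuration, J(q) the whole-body Jacobian, v* the desired end-effector velocity toward the goal, λ a damping weight, and Δt the control period.

```latex
% Generic damped differential-kinematics QP (illustrative assumption,
% not the paper's exact model).
\begin{aligned}
\min_{\dot{q}} \quad
  & \tfrac{1}{2}\,\bigl\| J(q)\,\dot{q} - v^{*} \bigr\|^{2}
    + \tfrac{\lambda}{2}\,\|\dot{q}\|^{2} \\
\text{s.t.} \quad
  & \dot{q}_{\min} \le \dot{q} \le \dot{q}_{\max}, \\
  & \frac{q_{\min} - q}{\Delta t} \le \dot{q} \le \frac{q_{\max} - q}{\Delta t}.
\end{aligned}
% Expanding the objective shows the standard QP form:
%   \tfrac{1}{2}\,\dot{q}^{\top}\bigl(J^{\top}J + \lambda I\bigr)\dot{q}
%   - \bigl(J^{\top}v^{*}\bigr)^{\top}\dot{q} + \text{const.}
```

In this sketch the damping term keeps the problem well conditioned near singular configurations, and the linear inequalities encode joint velocity and position limits. The singularity constraint named in the abstract may take a different form, for example a bound on a manipulability measure.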