Recent work on robotic manipulation through reinforcement learning (RL) or imitation learning (IL) has shown potential for tackling a range of tasks, e.g., opening a drawer or a cupboard. However, these techniques generalize poorly to unseen objects. We conjecture that this is due to the high-dimensional action space of joint-level control. In this paper, we take an alternative approach and separate the task of learning 'what to do' from 'how to do it', i.e., whole-body control. We pose the RL problem as one of determining the skill dynamics for a disembodied virtual manipulator interacting with articulated objects. A whole-body kinematic controller then computes the high-dimensional joint motion needed to reach the resulting goals in the workspace by solving a quadratic programming (QP) problem subject to singularity and kinematic constraints. Our experiments on manipulating complex articulated objects show that the proposed approach generalizes better to unseen objects with large intra-class variations than previous approaches. The evaluation results further indicate that it generates more compliant robotic motion and outperforms pure RL and IL baselines in task success rate.
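To make the 'how to do it' stage concrete, the following is a minimal sketch of velocity-level kinematic control posed as a QP. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not give the objective or constraints, so the sketch assumes a planar two-link arm, uses box limits on joint velocity as a stand-in for the kinematic constraints, and uses a damping term as one common way to keep the solution bounded near singularities. The function names, link lengths, and the projected-gradient solver are all hypothetical choices made to keep the example dependency-free.

```python
import math


def jacobian(q, l1=1.0, l2=1.0):
    """Jacobian of a planar 2-link arm (assumed toy robot, not from the paper)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def solve_velocity_qp(J, v, qd_max, damping=1e-2, iters=2000):
    """Solve  min_qd 0.5*||J qd - v||^2 + 0.5*damping*||qd||^2
       subject to -qd_max <= qd_i <= qd_max
    with projected gradient descent (an assumed solver choice)."""
    m, n = len(J), len(J[0])
    # Quadratic term H = J^T J + damping * I
    H = [[sum(J[k][i] * J[k][j] for k in range(m)) + (damping if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Linear term g = -J^T v
    g = [-sum(J[k][i] * v[k] for k in range(m)) for i in range(n)]
    # trace(H) bounds the largest eigenvalue of H, so 1/trace(H) is a safe step
    step = 1.0 / sum(H[i][i] for i in range(n))
    qd = [0.0] * n
    for _ in range(iters):
        grad = [sum(H[i][j] * qd[j] for j in range(n)) + g[i] for i in range(n)]
        # Gradient step followed by projection onto the velocity box
        qd = [min(qd_max, max(-qd_max, qd[i] - step * grad[i])) for i in range(n)]
    return qd


def task_residual(J, qd, v):
    """Euclidean norm of J qd - v, i.e. how far the joint motion misses the goal."""
    return math.sqrt(sum((sum(J[k][j] * qd[j] for j in range(len(qd))) - v[k]) ** 2
                         for k in range(len(v))))


if __name__ == "__main__":
    q = [0.3, 0.8]    # current joint angles (rad)
    v = [0.1, 0.0]    # desired end-effector velocity from the high-level policy
    J = jacobian(q)
    qd = solve_velocity_qp(J, v, qd_max=0.5)
    print("joint velocities:", qd)
    print("task residual:", task_residual(J, qd, v))
```

In this sketch the desired workspace velocity `v` plays the role of the goal produced by the learned high-level policy, and the QP maps it to joint motion. A full whole-body controller would have more joints than task dimensions and would typically use a dedicated QP solver; the structure of the problem is the same.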