Grasping an object in an ungraspable pose, such as a book or another large flat object lying horizontally on a table, is a challenging task. Inspired by human manipulation, we address this problem by pushing the object to the edge of the table and then grasping its overhanging part. In this paper, we develop a model-free Deep Reinforcement Learning framework to synergize pushing and grasping actions. We first pre-train a Variational Autoencoder to extract high-dimensional features from input scene images. A single Proximal Policy Optimization algorithm with a common reward and shared Actor-Critic layers is then employed to learn both pushing and grasping actions with high data efficiency. Experiments show that our single-network policy converges 2.5 times faster than a policy using two parallel networks. Moreover, experiments on unseen objects show that our policy generalizes to challenging cases, including objects with curved surfaces and off-center, irregularly shaped objects. Lastly, our policy can be transferred to a real robot without fine-tuning by using CycleGAN for domain adaptation, and it outperforms the push-to-wall baseline.
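To illustrate the idea of one Proximal Policy Optimization policy whose actor and critic share layers, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes a discrete action set covering both push and grasp primitives and a latent vector `z` standing in for the Variational Autoencoder features; the class `SharedActorCritic`, the function `ppo_clip_loss`, the layer sizes, and the clipping value 0.2 are all hypothetical choices made for illustration.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (a quantity to minimize).

    clip_eps=0.2 is an assumed value, not one taken from the paper.
    """
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate to get a loss.
    return -np.mean(np.minimum(unclipped, clipped))


class SharedActorCritic:
    """Toy actor-critic with one shared hidden layer (hypothetical sizes)."""

    def __init__(self, latent_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        # Shared trunk: both heads read the same hidden features.
        self.w_shared = rng.normal(0.0, 0.1, (latent_dim, hidden_dim))
        # Actor head: one logit per (assumed discrete) push or grasp action.
        self.w_actor = rng.normal(0.0, 0.1, (hidden_dim, n_actions))
        # Critic head: scalar state value.
        self.w_critic = rng.normal(0.0, 0.1, (hidden_dim, 1))

    def forward(self, z):
        """Map latent features z of shape (batch, latent_dim) to
        action log-probabilities and state values."""
        h = np.tanh(z @ self.w_shared)
        logits = h @ self.w_actor
        # Numerically stable log-softmax over actions.
        logits = logits - logits.max(axis=-1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
        values = (h @ self.w_critic).squeeze(-1)
        return log_probs, values


# Usage: a batch of 4 placeholder latent vectors, 6 hypothetical actions.
model = SharedActorCritic(latent_dim=8, hidden_dim=16, n_actions=6)
z = np.zeros((4, 8))
log_probs, values = model.forward(z)
```

Because both heads read the same hidden features, a single common reward can shape the representation used for pushing and grasping alike, which is the design choice contrasted here with two parallel networks.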