Prevailing grasp prediction methods rely predominantly on offline learning, overlooking the grasp learning that takes place during real-time adaptation to novel picking scenarios. Such scenarios may involve previously unseen objects, changes in camera perspective, and different bin configurations, among other factors. In this paper, we introduce SSL-ConvSAC, a novel approach that combines semi-supervised learning and reinforcement learning for online grasp learning. By treating pixels with reward feedback as labeled data and all other pixels as unlabeled, it efficiently exploits unlabeled data to enhance learning. In addition, we address the imbalance between labeled and unlabeled data by proposing a contextual curriculum-based method. We ablate the proposed approach on real-world evaluation data and demonstrate its promise for improving online grasp learning on bin-picking tasks using a physical 7-DoF Franka Emika robot arm with a suction gripper. Video: https://youtu.be/OAro5pg8I9U
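The labeled/unlabeled pixel split described in the abstract can be illustrated with a small sketch. This is not the SSL-ConvSAC implementation: it assumes a per-pixel grasp-success map with values in (0, 1), a single executed grasp pixel that received a binary reward, and a generic confidence-thresholded pseudo-labeling loss for the remaining pixels. The function names, the fixed threshold, and the loss weight are all hypothetical, and the contextual curriculum for handling label imbalance is not reproduced here.

```python
import math


def bce(p, target, eps=1e-7):
    """Binary cross-entropy for one probability p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def semi_supervised_pixel_loss(q_map, grasp_pixel, reward,
                               conf_threshold=0.9, unlabeled_weight=0.1):
    """Combine a supervised loss on the rewarded pixel with a pseudo-label loss.

    q_map:       2D list of predicted grasp-success probabilities (assumed format).
    grasp_pixel: (row, col) of the executed grasp, the only labeled pixel.
    reward:      1.0 for a successful grasp, 0.0 for a failed one.
    Returns (total_loss, number_of_pseudo_labeled_pixels).
    """
    row_idx, col_idx = grasp_pixel
    # Labeled data: the pixel where the robot acted and observed a reward.
    sup_loss = bce(q_map[row_idx][col_idx], reward)

    # Unlabeled data: every other pixel. Only confident predictions are
    # turned into pseudo-labels (a generic scheme, assumed for illustration).
    pseudo_losses = []
    for r, row in enumerate(q_map):
        for c, q in enumerate(row):
            if (r, c) == (row_idx, col_idx):
                continue
            if max(q, 1.0 - q) >= conf_threshold:
                pseudo_label = 1.0 if q >= 0.5 else 0.0
                pseudo_losses.append(bce(q, pseudo_label))

    unsup_loss = sum(pseudo_losses) / len(pseudo_losses) if pseudo_losses else 0.0
    return sup_loss + unlabeled_weight * unsup_loss, len(pseudo_losses)


# Example: a 2x2 map where the grasp at (1, 1) succeeded.
example_map = [[0.95, 0.5], [0.5, 0.6]]
total_loss, n_pseudo = semi_supervised_pixel_loss(example_map, (1, 1), 1.0)
```

In a real training loop the pseudo-labels would typically be computed without gradient flow, or from a separate target network, so that the model cannot trivially reduce the loss by matching its own predictions.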