Mobile grasping enhances manipulation efficiency by exploiting a robot's mobility. This study aims to enable a commercial off-the-shelf robot to perform mobile grasping, which requires precise timing and pose adjustment. Self-supervised learning can develop a generalizable policy that adjusts the robot's velocity and determines the grasp position and orientation based on the target object's shape and pose. Because mobile grasping is complex, action primitivization and step-by-step learning are crucial to avoid data sparsity when learning from trial and error. This study simplifies mobile grasping into two grasp action primitives and a moving action primitive, which can be executed with limited degrees of freedom of the manipulator. Three fully convolutional neural network (FCN) models are introduced to predict the static grasp primitive, the dynamic grasp primitive, and the residual moving velocity error from visual inputs. A two-stage grasp learning approach facilitates seamless FCN model training. An ablation study demonstrated that the proposed method achieved the highest grasping accuracy and pick-and-place efficiency. Furthermore, randomizing object shapes and environments in simulation effectively achieved generalizable mobile grasping.
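As a rough illustrative sketch only (not the paper's implementation), the three-FCN prediction setup described above can be pictured as three per-pixel prediction heads applied to a visual input. All layer sizes, class names, and the toy heightmap below are assumptions introduced for illustration:

```python
import numpy as np

def conv2d(x, w):
    """Minimal 'same'-padded single-channel 2D convolution via explicit loops."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * w)
    return out

class TinyFCN:
    """Toy fully convolutional head: two conv layers with a ReLU in between.

    Because there are no dense layers, the output is a map with the same
    spatial size as the input, giving one prediction per pixel.
    """
    def __init__(self, seed):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((3, 3)) * 0.1
        self.w2 = rng.standard_normal((3, 3)) * 0.1

    def __call__(self, x):
        h = np.maximum(conv2d(x, self.w1), 0.0)  # ReLU activation
        return conv2d(h, self.w2)

# Three heads mirroring the abstract's three FCN models (names are assumptions):
static_grasp_fcn = TinyFCN(0)   # pixel-wise static-grasp affordance
dynamic_grasp_fcn = TinyFCN(1)  # pixel-wise dynamic-grasp affordance
residual_vel_fcn = TinyFCN(2)   # per-pixel residual moving-velocity error

# Mock visual input: a 16x16 heightmap with a square "object" in the middle.
heightmap = np.zeros((16, 16))
heightmap[6:10, 6:10] = 1.0

q_static = static_grasp_fcn(heightmap)
best = np.unravel_index(np.argmax(q_static), q_static.shape)  # chosen grasp pixel
```

The fully convolutional structure is what lets each model output a dense prediction map over the image, so the grasp position can be read off as the argmax pixel, as in the last line.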