Dynamic grasping of moving objects in complex, continuous motion scenarios remains challenging. Reinforcement Learning (RL) has been applied to various robotic manipulation tasks, benefiting from its closed-loop property. However, existing RL-based methods do not fully exploit the potential of enhanced visual representations. In this letter, we propose a novel framework called Grasps As Points for RL (GAP-RL) to effectively and reliably grasp moving objects. Building on a fast region-based grasp detector, we design a Grasp Encoder that transforms 6D grasp poses into Gaussian points and extracts grasp features as a higher-level abstraction than raw object point features. Additionally, we develop a Graspable Region Explorer for real-world deployment, which searches for consistent graspable regions, enabling smoother grasp generation and stable policy execution. To assess performance fairly, we construct a simulated dynamic grasping benchmark involving objects with various complex motions. Experimental results demonstrate that our method generalizes effectively to novel objects and unseen dynamic motions compared to other baselines. Real-world experiments further validate the framework's sim-to-real transferability.
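To make the "grasps as points" idea concrete, the sketch below shows one generic way a 6D grasp pose (position plus orientation) can be mapped to a small set of 3D points, so that grasps and object point clouds share a unified point-based representation. This is a toy illustration under assumed gripper keypoints (palm center, fingertips, approach point); the paper's actual Gaussian-point encoding and the Grasp Encoder architecture are not reproduced here.

```python
# Toy sketch (assumption, not the paper's implementation): mapping a 6D
# grasp pose to gripper keypoints so a point-based encoder can consume
# grasps alongside object points.
import numpy as np

def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def grasp_to_points(position, quaternion, width=0.08, depth=0.04):
    """Map a 6D grasp pose to canonical gripper keypoints in the world frame.

    The keypoint layout and the width/depth defaults are illustrative
    assumptions for a parallel-jaw gripper.
    """
    half = width / 2.0
    keypoints = np.array([
        [0.0,  0.0,   0.0],    # palm center
        [0.0,  half,  depth],  # left fingertip
        [0.0, -half,  depth],  # right fingertip
        [0.0,  0.0,  -depth],  # approach point behind the palm
    ])
    R = quat_to_rotmat(quaternion)
    # Rotate the canonical keypoints into the grasp frame, then translate.
    return keypoints @ R.T + np.asarray(position)

# With the identity rotation, keypoints are simply translated to the grasp position.
pts = grasp_to_points([0.5, 0.0, 0.2], [1.0, 0.0, 0.0, 0.0])
```

In such a representation, each grasp contributes a fixed-size set of points that can be fed to the same point-feature backbone as the observed object cloud, which is the kind of unification the abstract describes at a high level.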