In a future with autonomous robots, visual and spatial perception will be essential for robotic systems. This holds particularly for aerial robotics, where many real-world applications cannot work without visual perception. Robotic aerial grasping with drones promises fast pick-and-place solutions with far greater mobility than other robotic solutions. Using Mask R-CNN scene segmentation (detectron2), we propose a vision-based system for autonomous rapid aerial grasping that does not rely on markers for object localization and does not require the appearance of the object to be known in advance. Combining the segmented images with spatial information from a depth camera, we generate a dense point cloud of the detected objects and perform geometry-based grasp planning to determine grasping points on the objects. In real-world experiments on a dynamically grasping aerial platform, we show that our system reaches up to 94.5% of the baseline grasping success rate obtained when a motion capture system is used for object localization. Our results show the first use of geometry-based grasping techniques on a flying platform and aim to increase the autonomy of existing aerial manipulation platforms, bringing them closer to real-world applications in warehouses and similar environments.
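The step from a segmentation mask and a depth image to a point cloud and a grasp direction can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mask_to_points` and `pca_grasp` are hypothetical, the camera is assumed to follow a pinhole model with intrinsics `fx`, `fy`, `cx`, `cy`, the depth image is assumed to be in meters with zeros marking invalid pixels, and a simple PCA heuristic stands in for whatever geometry-based grasp planner the system actually uses.

```python
import numpy as np


def mask_to_points(depth_m, mask, fx, fy, cx, cy):
    """Back-project the masked, valid depth pixels into 3D camera-frame points.

    Assumes a pinhole camera model and depth in meters (0 = invalid).
    Returns an (N, 3) array of [x, y, z] points.
    """
    valid = mask.astype(bool) & (depth_m > 0)
    v, u = np.nonzero(valid)          # pixel rows (v) and columns (u)
    z = depth_m[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def pca_grasp(points):
    """Return the centroid and the principal (longest) axis of a point cloud.

    Illustrative heuristic only: a gripper could be centered on the centroid
    and closed perpendicular to the long axis.
    """
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    long_axis = eigvecs[:, -1]               # direction of largest variance
    return centroid, long_axis
```

In a real pipeline the mask would come from the segmentation network and the intrinsics from the depth camera's calibration; the resulting points would also need to be transformed from the camera frame into the frame used by the drone's controller.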