Online Tool Selection with Learned Grasp Prediction Models

Deep learning-based grasp prediction models have become an industry standard for robotic bin-picking systems. To maximize pick success, production environments are often equipped with several end-effector tools that can be swapped on the fly, based on the target object. Tool changes, however, take time. Choosing the order of grasps to perform, and the corresponding tool-change actions, can improve system throughput; this is the topic of our work. The main challenge in planning tool changes is uncertainty: we typically cannot see objects in the bin that are currently occluded. Inspired by queuing and admission control problems, we model the problem as a Markov Decision Process (MDP), where the goal is to maximize expected throughput, and we pursue an approximate solution based on model predictive control, where at each time step we plan based only on the currently visible objects. Special to our method is the idea of void zones: geometrically bounded regions in which an unknown object will be present, and which therefore cannot be accounted for during planning. Our planning problem can be solved using integer linear programming (ILP). However, we find that an approximate solution based on sparse tree search yields near-optimal performance in a fraction of the time. Another question that we explore is how to measure the performance of tool-change planning: we find that throughput alone can fail to capture delicate and smooth behavior, and we propose a principled alternative. Finally, we demonstrate our algorithms on both synthetic and real-world bin-picking tasks.
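
To make the planning loop described above more concrete, the following is a minimal Python sketch, not the implementation from this work. It assumes each visible grasp candidate carries a position, a required tool, and a predicted success probability. It scores a fixed-horizon plan by expected successful picks per unit time, and it approximates a void zone as a circular exclusion radius around each planned pick. All names here (`Grasp`, `plan`, `pick_time`, `swap_time`, `void_radius`, `branch`) and the numeric defaults are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Grasp:
    """A visible grasp candidate (hypothetical structure for this sketch)."""
    x: float
    y: float
    tool: int          # index of the end-effector tool this grasp requires
    p_success: float   # success probability from a grasp prediction model


def plan(grasps, current_tool, depth, pick_time=1.0, swap_time=3.0,
         void_radius=0.1, branch=3):
    """Depth-limited sparse tree search over the currently visible grasps.

    Returns (estimated_throughput, sequence), where throughput is expected
    successful picks divided by total time, including tool swaps.
    """
    best = (0.0, [])

    def search(available, tool, exp_picks, elapsed, seq):
        nonlocal best
        # Leaf: horizon reached, or nothing left that can be planned.
        if len(seq) == depth or not available:
            if seq and exp_picks / elapsed > best[0]:
                best = (exp_picks / elapsed, list(seq))
            return
        # Sparse expansion: only the `branch` best grasps for each tool.
        candidates = []
        for t in sorted({g.tool for g in available}):
            same_tool = sorted((g for g in available if g.tool == t),
                               key=lambda g: -g.p_success)
            candidates.extend(same_tool[:branch])
        for g in candidates:
            cost = pick_time + (swap_time if g.tool != tool else 0.0)
            # Assumed void-zone rule: once g is picked, grasps within
            # void_radius of it are dropped from planning, since the scene
            # there is unknown until it is observed again.
            remaining = [h for h in available
                         if h is not g
                         and hypot(h.x - g.x, h.y - g.y) > void_radius]
            seq.append(g)
            search(remaining, g.tool, exp_picks + g.p_success,
                   elapsed + cost, seq)
            seq.pop()

    search(list(grasps), current_tool, 0.0, 0.0, [])
    return best


if __name__ == "__main__":
    visible = [Grasp(0.0, 0.0, 0, 0.9), Grasp(1.0, 0.0, 1, 0.95),
               Grasp(2.0, 0.0, 0, 0.8)]
    rate, sequence = plan(visible, current_tool=0, depth=3)
    # In a model-predictive-control loop, only sequence[0] would be executed;
    # the bin would then be re-imaged and the plan recomputed.
    print(rate, [g.tool for g in sequence])
```

In a receding-horizon setting, only the first grasp of the returned sequence would be executed before replanning from a fresh image. The per-tool `branch` limit is one simple way to keep the tree sparse; other pruning rules are equally plausible.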
