A common theme in robot assembly is the adoption of Manipulation Primitives as the atomic motions from which an assembly strategy is composed, typically in the form of a state machine or a graph. While this approach has shown strong performance and robustness in increasingly complex assembly tasks, the state machine has to be engineered manually in most cases. Such hard-coded strategies fail to handle unexpected situations that were not considered at design time. To address this issue, we propose to find dynamic sequences of manipulation primitives through Reinforcement Learning. By leveraging parameterized manipulation primitives, the proposed method greatly improves both assembly performance and the sample efficiency of Reinforcement Learning compared to previous work that uses non-parameterized manipulation primitives. In practice, our method achieves good zero-shot sim-to-real performance on high-precision peg insertion tasks with different geometries, clearances, and materials.
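To make the idea of a parameterized manipulation primitive concrete, the following is a minimal sketch of how a policy's output could be decoded into a primitive plus its parameters at each step, instead of following a fixed state machine. Everything in it is an assumption made for illustration: the primitive names, the parameter ranges, the `decode_action` helper, and the random stand-in policy are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Primitive:
    """A motion family together with the bounds of its tunable parameters."""
    name: str
    bounds: Dict[str, Tuple[float, float]]


# Hypothetical primitive library; names and ranges are illustrative only.
LIBRARY: List[Primitive] = [
    Primitive("move_until_contact", {"speed": (0.001, 0.02), "force_threshold": (1.0, 10.0)}),
    Primitive("spiral_search", {"radius": (0.0005, 0.005), "speed": (0.001, 0.01)}),
    Primitive("insert", {"force": (2.0, 20.0), "compliance": (0.1, 1.0)}),
]


@dataclass
class Action:
    primitive: Primitive
    params: Dict[str, float]


def decode_action(index: int, unit_params: List[float]) -> Action:
    """Map a discrete primitive index and values in [0, 1] to concrete parameters."""
    prim = LIBRARY[index]
    params: Dict[str, float] = {}
    for u, (key, (lo, hi)) in zip(unit_params, prim.bounds.items()):
        u = min(max(u, 0.0), 1.0)  # clip out-of-range policy outputs
        params[key] = lo + u * (hi - lo)
    return Action(prim, params)


Policy = Callable[[List[float], random.Random], Tuple[int, List[float]]]


def random_policy(observation: List[float], rng: random.Random) -> Tuple[int, List[float]]:
    """Stand-in for a learned policy: picks a primitive and its parameters at random."""
    return rng.randrange(len(LIBRARY)), [rng.random(), rng.random()]


def rollout(policy: Policy, steps: int, seed: int = 0) -> List[Action]:
    """Build a primitive sequence step by step rather than from a fixed graph."""
    rng = random.Random(seed)
    observation = [0.0] * 6  # placeholder observation, e.g. a force/torque reading
    return [decode_action(*policy(observation, rng)) for _ in range(steps)]
```

In this sketch a non-parameterized variant would correspond to fixing `unit_params` for every primitive, so the policy could only choose which primitive to run. Letting the policy also output the parameters is what the term "parameterized" refers to here.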