In sim-to-real reinforcement learning (RL), a policy is trained in a simulated environment and then deployed on the physical system. The main challenge of sim-to-real RL is to overcome the reality gap, that is, the discrepancies between the real world and its simulated counterpart. Using general geometric representations, such as convex decompositions, triangular meshes, and signed distance fields, can improve simulation fidelity and thus potentially narrow the reality gap. A shared drawback of these representations is that they generate many contact points for geometrically complex objects, which slows down simulation and may cause numerical instability. Contact reduction methods address these issues by limiting the number of contact points, but the validity of these methods for sim-to-real RL has not been confirmed. In this paper, we present a contact reduction method with bounded stiffness to improve simulation accuracy. Our experiments show that the proposed method is critical to training an RL policy for a tight-clearance double-pin insertion task and to deploying that policy successfully on a rigid, position-controlled physical robot.