Recent successes in applying reinforcement learning (RL) to robotics have shown that it is a viable approach for constructing robotic controllers. However, RL controllers can produce many collisions in environments where new obstacles appear during execution, which poses a problem in safety-critical settings. We present a hybrid approach, called iKinQP-RL, that uses an Inverse Kinematics Quadratic Programming (iKinQP) controller to correct actions proposed by an RL policy at runtime. This ensures safe execution in the presence of new obstacles not seen during training. Preliminary experiments show that our iKinQP-RL framework completely eliminates collisions with new obstacles while maintaining a high task success rate.
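The core idea of correcting an RL action at runtime can be viewed as solving a small QP: find the action closest to the policy's proposal that satisfies a safety constraint. The sketch below is a minimal illustration of that projection step only, not the authors' iKinQP implementation; it assumes a single linear safety constraint (a halfspace `n·a <= d`), for which the QP has a closed-form solution. The function name `correct_action` and the constraint parameters are illustrative.

```python
import numpy as np

def correct_action(a_rl, n, d):
    """Minimal QP-style safety filter (illustrative, not iKinQP itself).

    Solves  min_a ||a - a_rl||^2  s.t.  n @ a <= d,
    i.e., projects the RL-proposed action a_rl onto the safe
    halfspace {a : n @ a <= d}. With one linear constraint this
    projection has a closed form, so no QP solver is needed.
    """
    violation = n @ a_rl - d
    if violation <= 0:
        # Proposed action already satisfies the constraint: pass through.
        return a_rl
    # Move a_rl back onto the constraint boundary along its normal.
    return a_rl - violation * n / (n @ n)

# Example: the policy proposes moving to a = [1.0, 0.0], but safety
# requires the first coordinate to stay at or below 0.5.
corrected = correct_action(np.array([1.0, 0.0]), np.array([1.0, 0.0]), 0.5)
```

A full iKinQP controller would instead pose the QP in joint space with kinematic and collision constraints, but the pass-through-when-safe, project-when-unsafe structure is the same.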