Developing robot controllers capable of dexterous nonprehensile manipulation, such as pushing an object on a table, is challenging. The underactuated, hybrid-dynamics nature of the problem, further complicated by the uncertainty resulting from frictional interactions, requires sophisticated control behaviors. Reinforcement Learning (RL) is a powerful framework for developing such robot controllers. However, previous RL literature addressing the nonprehensile pushing task achieves low accuracy, non-smooth trajectories, and only simple motions, i.e., without rotation of the manipulated object. We conjecture that previously used unimodal exploration strategies fail to capture the inherent hybrid dynamics of the task, which arise from the different possible contact interaction modes between the robot and the object, such as sticking, sliding, and separation. In this work, we propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies for arbitrary starting and target object poses, i.e., positions and orientations, with improved accuracy. We show that the learned policies are robust to external disturbances and observation noise, and that they scale to tasks with multiple pushers. Furthermore, we validate the transferability of the learned policies, trained entirely in simulation, to physical robot hardware using the KUKA iiwa robot arm. See our supplemental video: https://youtu.be/vTdva1mgrk4.
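To illustrate why a categorical distribution can support multimodal exploration, the following is a minimal sketch, not the authors' implementation. It assumes that each continuous action dimension is discretized into a fixed number of bins and that the policy outputs one logit per bin. The function names, the 11-bin discretization, the action range [-1, 1], and the hand-set logits are all hypothetical choices made for this illustration.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bin_centers(low, high, n_bins):
    """Evenly spaced action values covering [low, high] (assumed discretization)."""
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in range(n_bins)]


def sample_action(logits_per_dim, low, high, rng):
    """Sample one action by drawing a bin independently for each action dimension.

    logits_per_dim: one list of logits per action dimension (hypothetical policy output).
    """
    action = []
    for logits in logits_per_dim:
        probs = softmax(logits)
        centers = bin_centers(low, high, len(logits))
        idx = rng.choices(range(len(logits)), weights=probs, k=1)[0]
        action.append(centers[idx])
    return action


def demo(n_samples=1000, seed=0):
    """Draw samples from a hand-made bimodal categorical over a 1-D action."""
    rng = random.Random(seed)
    logits = [-20.0] * 11   # almost no probability mass on most bins
    logits[2] = 0.0         # first mode, near -0.6
    logits[8] = 0.0         # second mode, near +0.6
    return [sample_action([logits], -1.0, 1.0, rng)[0] for _ in range(n_samples)]
```

In this toy setup the samples concentrate around two separated action values. A single Gaussian with the same mean would instead place most of its mass near zero, between the two modes, which is the kind of limitation the abstract attributes to unimodal exploration.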