Highly accurate dynamic or simulator models that closely match the real robot can facilitate model-based control (e.g., model predictive control or linear-quadratic regulators) and model-based trajectory planning (e.g., trajectory optimization), and can reduce the learning time required by reinforcement learning methods. The objective of this work is therefore to learn the residual errors between a dynamic and/or simulator model and the real robot. We do so with a neural network whose parameters are updated through an Unscented Kalman Filter (UKF) formulation. With this method, we model the residual errors using only small amounts of data, a necessity because we improve the simulator/dynamic model by learning directly from real-world operation. We demonstrate our method on robotic hardware (e.g., a manipulator arm and a wheeled robot) and show that the learned residual errors further close the reality gap between dynamic models, simulations, and actual hardware.
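To make the idea of updating neural-network parameters with a UKF concrete, the following is a minimal sketch and not the authors' implementation. It assumes the network weights are treated as the filter state with random-walk dynamics, and that each observed residual (real output minus nominal-model output) is a scalar measurement. The functions `nominal_model` and `real_system`, the network size, the noise levels `Q` and `R`, and the sigma-point settings are all hypothetical choices made for illustration; the data are synthetic.

```python
import numpy as np

def unpack(w, n_hidden):
    """Split the flat weight vector of a 1-input, 1-output tanh MLP."""
    h = n_hidden
    return w[:h], w[h:2 * h], w[2 * h:3 * h], w[3 * h]

def residual_net(w, x, n_hidden):
    """Predicted residual for a scalar or 1-D array input x."""
    w1, b1, w2, b2 = unpack(w, n_hidden)
    hidden = np.tanh(np.outer(np.atleast_1d(x), w1) + b1)  # (N, h)
    return hidden @ w2 + b2                                # (N,)

def ukf_weight_update(w, P, x, y, n_hidden, Q, R,
                      alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF step: weights are the state, the scalar residual y is the measurement."""
    n = w.size
    lam = alpha**2 * (n + kappa) - n
    # Time update: random-walk weights, so the mean is unchanged and P grows by Q.
    P = P + Q
    # Sigma points from the scaled unscented transform.
    L = np.linalg.cholesky((n + lam) * P)
    sigmas = np.vstack([w, w + L.T, w - L.T])              # (2n+1, n)
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    # Push every sigma point through the network at the current input.
    Y = np.array([residual_net(s, x, n_hidden)[0] for s in sigmas])
    y_hat = Wm @ Y
    dY = Y - y_hat
    dW = sigmas - w
    S = Wc @ dY**2 + R          # innovation variance (scalar)
    Pwy = dW.T @ (Wc * dY)      # weight/measurement cross-covariance, (n,)
    K = Pwy / S                 # Kalman gain
    w_new = w + K * (y - y_hat)
    P_new = P - np.outer(K, K) * S
    return w_new, 0.5 * (P_new + P_new.T)  # symmetrize for numerical safety

# Hypothetical 1-D stand-ins for the nominal model and the real robot.
def nominal_model(x):
    return 0.9 * x

def real_system(x):
    return 0.9 * x + 0.3 + 0.5 * np.sin(x)  # unmodeled offset and nonlinearity

rng = np.random.default_rng(0)
n_hidden = 4
w = 0.1 * rng.standard_normal(3 * n_hidden + 1)
P = 0.1 * np.eye(w.size)
Q = 1e-6 * np.eye(w.size)   # assumed process noise on the weights
R = 1e-2                    # assumed measurement-noise variance

# A small synthetic data set of observed residuals.
x_train = rng.uniform(-2.0, 2.0, size=150)
y_train = (real_system(x_train) - nominal_model(x_train)
           + 0.02 * rng.standard_normal(x_train.size))

x_test = np.linspace(-2.0, 2.0, 50)
r_test = real_system(x_test) - nominal_model(x_test)

def rmse(weights):
    return float(np.sqrt(np.mean((residual_net(weights, x_test, n_hidden) - r_test) ** 2)))

rmse_before = rmse(w)
for _ in range(3):  # a few passes over the small data set
    for x, y in zip(x_train, y_train):
        w, P = ukf_weight_update(w, P, x, y, n_hidden, Q, R)
rmse_after = rmse(w)

print(f"residual RMSE before: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

Because the update is derivative-free, the sketch needs no backpropagation: the network is only evaluated at the sigma points. The corrected prediction for a new state would then be `nominal_model(x) + residual_net(w, x, n_hidden)`.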