Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods rely heavily on accurate analytical models of the dynamics, yet some aspects of the physical world can only be captured to a limited extent. An alternative approach is to leverage machine learning techniques to learn a differentiable dynamics model of the system from data. In this work, we combine trajectory optimization and model learning to perform highly dynamic and complex tasks with robotic systems in the absence of accurate analytical models of the dynamics. We show that a neural network can accurately model highly nonlinear behaviors over long time horizons, from data collected in only 25 minutes of interaction on two distinct robots: (i) the Boston Dynamics Spot and (ii) a radio-controlled (RC) car. Furthermore, we use the gradients of the neural network to perform gradient-based trajectory optimization. In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and the RC car, and that it performs well in combination with trajectory optimization methods.
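To make the idea of optimizing a trajectory through the gradients of a learned model concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. Everything in it is an assumption chosen for illustration: the residual tanh network with fixed random weights stands in for a trained dynamics model, the state and control dimensions are arbitrary, the cost is a terminal distance to a goal plus a small control penalty, and the optimizer is single shooting with backtracking gradient descent. The names `step`, `step_vjp`, `cost_and_grad`, and `optimize` are hypothetical.

```python
import math
import random

# Hypothetical stand-in for a trained dynamics network: fixed random weights.
random.seed(0)
NX, NU, NH = 2, 1, 8  # state, control, and hidden sizes (illustrative only)
W1 = [[random.gauss(0, 0.3) for _ in range(NX + NU)] for _ in range(NH)]
b1 = [0.0] * NH
W2 = [[random.gauss(0, 0.3) for _ in range(NH)] for _ in range(NX)]
b2 = [0.0] * NX


def step(x, u):
    """One model step: x_next = x + W2 tanh(W1 [x; u] + b1) + b2."""
    z = x + u  # list concatenation, i.e. the stacked input [x; u]
    h = [math.tanh(sum(W1[i][j] * z[j] for j in range(NX + NU)) + b1[i])
         for i in range(NH)]
    x_next = [x[k] + sum(W2[k][i] * h[i] for i in range(NH)) + b2[k]
              for k in range(NX)]
    return x_next, h


def step_vjp(h, g):
    """Backpropagate g = dL/dx_next through one step; return dL/dx, dL/du."""
    d = [(1.0 - h[i] ** 2) * sum(W2[k][i] * g[k] for k in range(NX))
         for i in range(NH)]
    gz = [sum(W1[i][j] * d[i] for i in range(NH)) for j in range(NX + NU)]
    gx = [g[k] + gz[k] for k in range(NX)]  # identity path of the residual
    return gx, gz[NX:]


def cost_and_grad(x0, us, goal, r=1e-2):
    """Roll out the model, then backpropagate the cost to every control."""
    xs, hs = [x0], []
    for u in us:
        x_next, h = step(xs[-1], u)
        xs.append(x_next)
        hs.append(h)
    cost = sum((a - b) ** 2 for a, b in zip(xs[-1], goal))
    cost += r * sum(v ** 2 for u in us for v in u)
    g = [2.0 * (a - b) for a, b in zip(xs[-1], goal)]  # dL/dx_T
    grads = [None] * len(us)
    for t in reversed(range(len(us))):
        g, gu = step_vjp(hs[t], g)
        grads[t] = [gu[j] + 2.0 * r * us[t][j] for j in range(NU)]
    return cost, grads


def optimize(x0, goal, horizon=10, iters=50, lr=0.5):
    """Gradient descent on the controls, halving the step until cost drops."""
    us = [[0.0] * NU for _ in range(horizon)]
    cost, grads = cost_and_grad(x0, us, goal)
    history = [cost]
    for _ in range(iters):
        step_size = lr
        while step_size > 1e-12:
            trial = [[u[j] - step_size * gu[j] for j in range(NU)]
                     for u, gu in zip(us, grads)]
            trial_cost, trial_grads = cost_and_grad(x0, trial, goal)
            if trial_cost < cost:
                us, cost, grads = trial, trial_cost, trial_grads
                break
            step_size *= 0.5
        history.append(cost)
    return us, history
```

Calling `us, history = optimize([0.0, 0.0], [1.0, 0.0])` returns the optimized control sequence and the cost after each iteration. Because a step is accepted only when it lowers the cost, `history` never increases. In practice an automatic differentiation library would replace the hand-written `step_vjp`; it is spelled out here only to show where the network gradients enter the optimization.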