The performance of robots in high-level tasks depends on the quality of their low-level controllers, which require fine-tuning. However, the intrinsically nonlinear dynamics and controllers make manual tuning challenging. In this paper, we present DiffTune, a novel, gradient-based automatic tuning framework. We formulate controller tuning as a parameter optimization problem. Our method unrolls the dynamical system and controller into a computational graph and updates the controller parameters through gradient-based optimization. The gradient is obtained using sensitivity propagation, which is the only method available for gradient computation when tuning a physical system instead of its simulated counterpart. Furthermore, we use $\mathcal{L}_1$ adaptive control to compensate for the uncertainties that unavoidably exist in a physical system, so that the gradient is not biased by unmodelled uncertainties. We validate DiffTune on a Dubins car and a quadrotor in challenging simulation environments. In comparison with state-of-the-art auto-tuning methods, DiffTune achieves the best performance more efficiently, owing to its effective use of the system's first-order information. Experiments on tuning a nonlinear controller for a quadrotor show promising results: DiffTune achieves a 3.5x reduction in tracking error on an aggressive trajectory in only 10 trials over a 12-dimensional controller parameter space.
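To make the idea of sensitivity propagation concrete, the following is a minimal sketch on a toy problem, not the implementation used in DiffTune. It assumes a discrete-time double integrator, a PD controller with parameters `theta = [kp, kd]`, a quadratic tracking loss, and a fixed step size; all names and numerical values (`DT`, `N`, `reference`, `rollout_with_sensitivity`, `tune`) are hypothetical choices made for illustration. The sensitivity `S = dx/dtheta` is propagated alongside the state, and the loss gradient is accumulated from it by the chain rule.

```python
import numpy as np

DT = 0.05   # integration step in seconds (assumed for illustration)
N = 100     # number of steps in one rollout (assumed for illustration)


def reference(k):
    """Hypothetical reference: hold position 1 with zero velocity."""
    return np.array([1.0, 0.0])


def rollout_with_sensitivity(theta):
    """Roll out the closed loop once; return the loss and dL/dtheta."""
    A = np.array([[1.0, DT], [0.0, 1.0]])  # df/dx of the double integrator
    B = np.array([[0.0], [DT]])            # df/du of the double integrator
    x = np.zeros(2)                        # state: [position, velocity]
    S = np.zeros((2, 2))                   # sensitivity dx/dtheta, zero at k = 0
    loss, grad = 0.0, np.zeros(2)
    for k in range(N):
        e = reference(k) - x
        u = theta @ e                      # PD law: u = kp * e_pos + kd * e_vel
        dh_dx = -theta.reshape(1, 2)       # partial of u w.r.t. the state
        dh_dtheta = e.reshape(1, 2)        # partial of u w.r.t. the parameters
        # Sensitivity propagation: differentiate x_{k+1} = f(x_k, h(x_k, theta)).
        S = (A + B @ dh_dx) @ S + B @ dh_dtheta
        x = A @ x + B.flatten() * u        # advance the state
        err = x - reference(k + 1)
        loss += err @ err                  # quadratic tracking loss
        grad += 2.0 * err @ S              # chain rule: dL/dx times dx/dtheta
    return loss, grad


def tune(theta0, lr=1e-3, iters=50):
    """Plain gradient descent on the controller parameters."""
    theta = np.array(theta0, dtype=float)
    history = []
    for _ in range(iters):
        loss, grad = rollout_with_sensitivity(theta)
        history.append(loss)
        theta = theta - lr * grad
    return theta, history


if __name__ == "__main__":
    theta, history = tune([1.0, 0.5])
    print("tuned parameters:", theta)
    print("loss before and after:", history[0], history[-1])
```

In this sketch the states come from the same model that supplies the Jacobians. When tuning on hardware, the states entering the loss and the Jacobians would instead be measured from the physical system, which is the setting the abstract refers to.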