This paper proposes an adaptive near-hover position controller for quadcopters that can be deployed on platforms of very different mass, size, and motor constants, and that adapts rapidly to unknown disturbances at runtime. The core algorithmic idea is to learn a single policy that adapts online at test time, within one framework, not only to disturbances applied to the drone but also to the robot dynamics and hardware. We achieve this by training a neural network to estimate a latent representation of the robot and environment parameters, which conditions the behaviour of the controller, itself represented as a neural network. We train both networks exclusively in simulation, with the objective of flying the quadcopters to goal positions while avoiding crashes into the ground. We deploy the same simulation-trained controller, without any modification, on two real-world quadcopters that differ in mass, size, motors, and propellers, with masses differing by a factor of 4.5. In addition, we show rapid adaptation to sudden and large disturbances of up to one-third of the quadcopter's mass. We perform an extensive evaluation in both simulation and the physical world, where we outperform a state-of-the-art learning-based adaptive controller and a traditional PID controller tuned specifically to each platform. Video results can be found at https://youtu.be/U-c-LbTfvoA.
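The two-network structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the state and action dimensions, the choice of a history of past states and actions as the estimator input, and the names `estimator`, `controller`, and `control_step` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

# All dimensions below are illustrative assumptions, not values from the paper.
STATE_DIM = 12    # assumed: position, velocity, attitude, body rates
ACTION_DIM = 4    # one command per rotor
LATENT_DIM = 8    # assumed size of the latent representation
HISTORY_LEN = 50  # assumed number of past steps fed to the estimator

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Hypothetical estimator: recent states and actions -> latent vector.
estimator = init_mlp([HISTORY_LEN * (STATE_DIM + ACTION_DIM), 64, LATENT_DIM])
# Hypothetical controller: current state plus latent -> rotor commands.
controller = init_mlp([STATE_DIM + LATENT_DIM, 64, ACTION_DIM])

def control_step(state, history):
    """Estimate the latent from history, then condition the controller on it."""
    z = mlp(estimator, history.reshape(-1))
    # Squash commands to [-1, 1]; the real actuator mapping is not specified here.
    action = np.tanh(mlp(controller, np.concatenate([state, z])))
    return action, z

state = np.zeros(STATE_DIM)
history = np.zeros((HISTORY_LEN, STATE_DIM + ACTION_DIM))
action, z = control_step(state, history)
print(action.shape, z.shape)
```

In this sketch the latent vector is recomputed at every control step, so a change in the platform or in the disturbance would reach the controller only through `z`, which is one way to read the single-policy idea stated above.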