This paper presents a technique that uses reinforcement learning (RL) to adapt the control gains of a quadcopter controller. Specifically, we use Proximal Policy Optimization (PPO) to train a policy that adapts the gains of a cascaded feedback controller in flight. The controller's primary goal is to minimize tracking error while following a specified trajectory. Our key objective is to analyze the effectiveness of the adaptive gain policy and compare its performance with that of a static gain control algorithm, using the Integral Squared Error (ISE) and Integral Time Squared Error (ITSE) as metrics. The results show that the adaptive gain scheme reduces tracking error by more than 40$\%$ compared with the static gain controller.
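The abstract names the two metrics without defining them. Under their common textbook definitions, $\mathrm{ISE} = \int_0^T e(t)^2\,dt$ and $\mathrm{ITSE} = \int_0^T t\,e(t)^2\,dt$, where $e(t)$ is the tracking error. The sketch below shows one way these could be evaluated from sampled flight data with the trapezoidal rule. The function names, the scalar error signal, and the choice of integration rule are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def _trapezoid(t: Sequence[float], y: Sequence[float]) -> float:
    """Trapezoidal-rule integral of samples y taken at times t."""
    if len(t) != len(y):
        raise ValueError("t and y must have the same length")
    return sum(
        0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i])
        for i in range(len(t) - 1)
    )


def ise(t: Sequence[float], e: Sequence[float]) -> float:
    """Integral Squared Error: integral of e(t)^2 over the sampled interval."""
    return _trapezoid(t, [ei * ei for ei in e])


def itse(t: Sequence[float], e: Sequence[float]) -> float:
    """Integral Time Squared Error: integral of t * e(t)^2 (assumed definition)."""
    return _trapezoid(t, [ti * ei * ei for ti, ei in zip(t, e)])


if __name__ == "__main__":
    # Hypothetical error samples, not data from the paper.
    times = [0.0, 1.0, 2.0]
    errors = [1.0, 1.0, 1.0]
    print(ise(times, errors), itse(times, errors))
```

Because ITSE weights the squared error by elapsed time, it penalizes errors that persist late in the trajectory more heavily than ISE does, which is why the two are often reported together.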