Continuous trajectory tracking control of quadrotors is difficult in the presence of environmental noise. Because the environmental dynamics are hard to model, tracking methods based on conventional control theory, such as model predictive control, are limited in tracking accuracy and response time. We propose Time-attenuating Twin Delayed DDPG, a model-free algorithm that is robust to noise, to better handle the trajectory tracking task. We construct a deep reinforcement learning framework in which a time decay strategy is designed to avoid getting trapped in local optima. Experimental results show that the tracking error is small and that the operation time is one-tenth that of a traditional algorithm. The proposed algorithm is verified with the OpenAI MuJoCo tool, and the simulation results show that it significantly improves training efficiency and effectively improves accuracy and convergence stability.
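The abstract does not say how the time decay strategy is defined. As a minimal sketch of one plausible reading, the code below attenuates the exploration noise that a TD3-style agent adds to its actions, using an exponential schedule. The schedule shape, the function names `decayed_noise_scale` and `explore_action`, and all parameter values are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def decayed_noise_scale(step, sigma_start=0.2, sigma_end=0.02, decay_steps=10_000):
    """Exploration-noise standard deviation at a given training step.

    Assumed schedule (not from the paper): exponential decay from
    sigma_start toward the floor sigma_end with time constant decay_steps.
    """
    return sigma_end + (sigma_start - sigma_end) * math.exp(-step / decay_steps)


def explore_action(policy_action, step, low=-1.0, high=1.0, rng=random):
    """Add time-attenuated Gaussian noise to each action component.

    policy_action is the deterministic actor output (a list of floats).
    The noisy action is clipped to the assumed actuator range [low, high].
    """
    sigma = decayed_noise_scale(step)
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in policy_action]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatability
    action = [0.5, -0.3, 0.0, 0.9]  # hypothetical four-rotor command
    for step in (0, 10_000, 100_000):
        print(step, decayed_noise_scale(step), explore_action(action, step, rng=rng))
```

Under this assumption the agent explores widely early in training and settles toward the deterministic policy later, which is one common way to reduce the risk of converging prematurely to a poor local optimum.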