This paper presents a novel reinforcement learning (RL) framework for trajectory tracking of unmanned aerial vehicles (UAVs) in cluttered environments using a dual-agent architecture. Traditional optimization-based methods for trajectory tracking face significant computational challenges and lack robustness in dynamic environments. Our approach employs deep RL to overcome these limitations, leveraging 3D point cloud data to perceive the environment without relying on memory-intensive obstacle representations such as occupancy grids. The proposed system features two RL agents: one predicts UAV velocities to follow a reference trajectory, and the other manages collision avoidance in the presence of obstacles. This architecture ensures real-time performance and adaptability to uncertainties. We demonstrate the efficacy of our approach through simulated and real-world experiments, highlighting improvements over state-of-the-art RL- and optimization-based methods. Additionally, a curriculum learning paradigm is employed to scale the algorithm to more complex environments, ensuring robust trajectory tracking and obstacle avoidance in both static and dynamic scenarios.