The proliferation of unmanned aerial vehicles (UAVs) in controlled airspace presents significant risks, including potential collisions, disruptions to air traffic, and security threats. Ensuring the safe and efficient operation of airspace, particularly in urban environments and near critical infrastructure, necessitates effective methods to intercept unauthorized or non-cooperative UAVs. This work addresses the critical need for robust, adaptive systems capable of managing such threats through the use of Reinforcement Learning (RL). We present a novel approach utilizing RL to train fixed-wing UAV pursuer agents for intercepting dynamic evader targets. Our methodology explores both model-based and model-free RL algorithms, specifically DreamerV3, Truncated Quantile Critics (TQC), and Soft Actor-Critic (SAC). The training and evaluation of these algorithms were conducted under diverse scenarios, including unseen evasion strategies and environmental perturbations. Our approach leverages high-fidelity flight dynamics simulations to create realistic training environments. This research underscores the importance of developing intelligent, adaptive control systems for UAV interception, significantly contributing to the advancement of secure and efficient airspace management. It demonstrates the potential of RL to train systems capable of autonomously achieving these critical tasks.