Modern autonomous drone missions increasingly require software frameworks that integrate structured symbolic planning with adaptive reinforcement learning (RL). Although traditional rule-based architectures offer robust structured reasoning for drone autonomy, they fall short in dynamic, complex operational environments that require adaptive symbolic planning. Symbolic RL (SRL), using the Planning Domain Definition Language (PDDL), explicitly integrates domain-specific knowledge and operational constraints, significantly improving the reliability and safety of unmanned aerial vehicle (UAV) decision-making. In this study, we propose the AMAD-SRL framework, an extended and refined version of the Autonomous Mission Agents for Drones (AMAD) cognitive multi-agent architecture, enhanced with SRL for dynamic mission planning and execution. We validated the framework in a Software-in-the-Loop (SIL) environment structured identically to the intended Hardware-in-the-Loop Simulation (HILS) platform, ensuring a seamless transition to real hardware. Experimental results demonstrate stable integration and interoperability of modules, successful transitions between belief-desire-intention (BDI)-driven and SRL-driven planning phases, and consistent mission performance. Specifically, we evaluate a target acquisition scenario in which the UAV plans a surveillance path followed by a dynamic reentry path to secure the target while avoiding threat zones. In this SIL evaluation, mission efficiency, measured by the reduction in travel distance, improved by approximately 75% over a coverage-based baseline. This study establishes a robust foundation for handling complex UAV missions and discusses directions for further enhancement and validation.