Autonomous vehicles are well suited to continuous area patrolling. However, finding an optimal patrolling strategy is challenging for several reasons. First, patrolling environments are often complex and can include unknown environmental factors, such as wind or terrain. Second, autonomous vehicles can suffer failures and are subject to hardware constraints, such as limited battery life. Third, patrolling large areas often requires multiple agents that must coordinate their actions. In this work, we consider these challenges and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to patrol an environment with various unknown dynamics and factors, and they recharge themselves autonomously to support continuous collective patrolling. We propose a distributed homogeneous multi-agent architecture in which all patrolling agents execute identical policies locally, based on their local observations and shared location information. This architecture yields a patrolling system that tolerates agent failures and allows supplementary agents to be added, either to replace failed agents or to increase the overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including the overall patrol performance, the efficiency of battery recharging strategies, the overall fault tolerance, and the ability to cooperate with supplementary agents.
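To make the homogeneous architecture concrete, the following is a minimal Python sketch under stated assumptions. It is not the trained policy from this work: `shared_policy` is a hand-written heuristic standing in for the learned network, and the grid world, the idleness map, the single charging cell, the battery rule, and all names (`Agent`, `step`, `MOVES`) are hypothetical choices made only for illustration. What it shows is the structural idea: every agent calls the same policy function on its own state plus the shared positions of the other agents, so agents can fail or be added without changing the policy.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pos = Tuple[int, int]
# "stay" is listed last so that ties are broken in favour of moving.
MOVES: Dict[str, Pos] = {"up": (0, -1), "down": (0, 1),
                         "left": (-1, 0), "right": (1, 0), "stay": (0, 0)}


@dataclass
class Agent:
    pos: Pos
    battery: int
    alive: bool = True


def distance(a: Pos, b: Pos) -> int:
    """Manhattan distance on the grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shared_policy(own: Agent, idleness: List[List[int]],
                  others: List[Pos], charger: Pos) -> str:
    """Heuristic stand-in for the learned policy (an assumption, not the paper's model).

    Inputs mirror the architecture: the agent's own state, its view of the
    idleness map, and the shared locations of the other agents.
    """
    height, width = len(idleness), len(idleness[0])
    # Head home once the battery barely covers the trip to the charger.
    must_recharge = own.battery <= distance(own.pos, charger) + 1

    def score(move: str) -> float:
        dx, dy = MOVES[move]
        x, y = own.pos[0] + dx, own.pos[1] + dy
        if not (0 <= x < width and 0 <= y < height):
            return float("-inf")  # never leave the grid
        if must_recharge:
            return -distance((x, y), charger)
        crowding = sum(1 for p in others if p == (x, y))
        return idleness[y][x] - 10 * crowding  # prefer stale, uncrowded cells

    return max(MOVES, key=score)


def step(agents: List[Agent], idleness: List[List[int]],
         charger: Pos, capacity: int = 12) -> None:
    """Advance the world one tick; every live agent runs the same policy."""
    for row in idleness:
        for x in range(len(row)):
            row[x] += 1
    for agent in agents:
        if not agent.alive:
            continue
        others = [a.pos for a in agents if a.alive and a is not agent]
        dx, dy = MOVES[shared_policy(agent, idleness, others, charger)]
        agent.pos = (agent.pos[0] + dx, agent.pos[1] + dy)
        agent.battery -= 1
        if agent.pos == charger:
            agent.battery = capacity  # recharge on the charging cell
        elif agent.battery <= 0:
            agent.alive = False       # stranded with an empty battery
        idleness[agent.pos[1]][agent.pos[0]] = 0


if __name__ == "__main__":
    grid = [[0] * 4 for _ in range(4)]
    team = [Agent((0, 0), 12) for _ in range(3)]
    for t in range(40):
        if t == 15:
            team[0].alive = False          # simulate an agent failure
        if t == 25:
            team.append(Agent((0, 0), 12))  # add a supplementary agent
        step(team, grid, (0, 0))
    print("max idleness:", max(max(row) for row in grid))
```

Because the policy takes the other agents' positions as a variable-length list, the team size can change at run time. In this sketch that is a property of the hand-written heuristic; a learned policy would need an input encoding that handles a varying number of agents.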