Biological agents, such as humans and animals, can choose among a very large number of options in a limited time. They do so by using prior knowledge to find a solution that is not necessarily optimal but is good enough for the task at hand. In this work, we study the motion coordination of multiple drones under this paradigm, known as Bounded Rationality (BR), to accomplish cooperative motion planning tasks. Specifically, we design a prior policy that provides useful goal-directed navigation heuristics in familiar environments and adapts to unfamiliar ones via Reinforcement Learning augmented with an environment-dependent exploration noise. Integrating this prior policy into the game-theoretic bounded rationality framework allows agents in a group to make decisions quickly while accounting for the other agents' computational constraints. Our investigation shows that agents with a well-informed prior policy increase the efficiency of the group's collective decision-making. We conducted rigorous experiments in simulation and in the real world, which demonstrate that the ability of informed agents to navigate safely to the goal can guide the group to coordinate efficiently under the BR framework.