This paper presents a decentralized leader-follower formation control method for multi-robot systems, based on a reinforcement learning (RL) algorithm and applied to a swarm of small educational Sphero robots. Because tabular Q-learning requires large amounts of memory to store its Q-tables, this work instead implements the Double Deep Q-Network (DDQN) algorithm, which has achieved excellent results on many robotics problems. To improve system behavior, we trained two separate DDQN models: one for reaching the formation and one for maintaining it. The models select from a discrete set of robot motions (actions), which adapts the continuous nonlinear system to the discrete nature of RL. The approach has been tested in simulation and in real experiments, which show that the multi-robot system can achieve and maintain a stable formation without the need for complex mathematical models or nonlinear control laws.
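As an illustration of the DDQN idea mentioned above, the following minimal sketch computes the standard Double DQN learning target, $y = r + \gamma \, Q_{\text{target}}(s', \arg\max_a Q_{\text{online}}(s', a))$, for a batch of transitions. The abstract does not specify the networks, reward, or hyperparameters used for the Sphero robots, so the function name `ddqn_targets`, its arguments, and the default discount factor are assumptions made only for this example; the Q-values are passed in as plain lists rather than produced by a neural network.

```python
from typing import List, Sequence


def ddqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_online_next: Sequence[Sequence[float]],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """Return Double DQN targets for a batch of transitions.

    q_online_next[i][a] and q_target_next[i][a] are the online and target
    networks' Q-values for discrete action a in the next state of transition i.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # The online network selects the greedy next action ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... and the target network evaluates that action.
        # Terminal transitions do not bootstrap from the next state.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


if __name__ == "__main__":
    # Two hypothetical transitions with two discrete actions each.
    print(ddqn_targets(
        rewards=[1.0, 0.0],
        dones=[False, True],
        q_online_next=[[1.0, 2.0], [3.0, 0.0]],
        q_target_next=[[10.0, 20.0], [30.0, 40.0]],
        gamma=0.5,
    ))  # prints [11.0, 0.0]
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, where a single network does both and tends to overestimate Q-values. Because each network outputs one value per discrete action, no Q-table over the continuous robot state has to be stored.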