This study highlights the potential of image-based reinforcement learning methods for addressing swarm-related tasks. In multi-agent reinforcement learning, effective policy learning depends on how agents sense, interpret, and process their inputs. Traditional approaches often rely on handcrafted feature extraction or raw vector-based representations, which limits the scalability and efficiency of learned policies with respect to input ordering and size. In this work we propose an image-based reinforcement learning method for decentralized control of a multi-agent system, in which observations are encoded as structured visual inputs that can be processed by neural networks, which extract spatial features and produce novel decentralized motion-control rules. We evaluate our approach on a multi-agent convergence task in which agents with limited-range, bearing-only sensing aim to keep the swarm cohesive during aggregation. The algorithm's performance is evaluated against two benchmarks: an analytical solution proposed by Bellaiche and Bruckstein, which guarantees convergence but progresses slowly, and VariAntNet, a neural-network-based framework that converges much faster but achieves only moderate success rates in hard constellations. Our method achieves high convergence rates at a pace nearly matching that of VariAntNet, and in some scenarios it is the only practical alternative.
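To make the observation encoding concrete, the sketch below rasterizes a set of bearing-only neighbor observations onto a 2D grid that a convolutional network could consume. This is an illustrative assumption, not the paper's actual encoding: the function name, grid size, and unit-distance ring placement (since range is unobserved under bearing-only sensing) are all hypothetical.

```python
import numpy as np

def bearings_to_image(bearings, size=32):
    """Hypothetical sketch: encode bearing-only observations of neighbors
    as a single-channel image. Each bearing (radians, agent-centric) is
    placed on a fixed-radius ring, since distance is not observable."""
    img = np.zeros((size, size), dtype=np.float32)
    center = (size - 1) / 2.0
    radius = center * 0.9  # keep the ring inside the grid
    for theta in bearings:
        x = int(round(center + radius * np.cos(theta)))
        y = int(round(center + radius * np.sin(theta)))
        img[y, x] = 1.0  # mark the neighbor's direction
    return img

# Three neighbors at bearings 0, 90, and 180 degrees:
obs = bearings_to_image([0.0, np.pi / 2, np.pi])
```

An encoding of this kind is invariant to the order in which neighbors are listed and accepts any number of neighbors, which is one way a fixed-size image input can sidestep the ordering and size limitations of raw vector representations noted above.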