Robotics research has increasingly focused on cooperative multi-agent problems, in which agents must work together and communicate to achieve a shared objective. To tackle this challenge, we explore imitation learning algorithms. These methods learn a controller by observing demonstrations from an expert, here a centralised omniscient controller that perceives the entire environment, including the state and observations of all agents. Performing a task with complete knowledge of the system state is relatively easy, but centralised solutions may not be feasible in real scenarios, where agents have no direct access to the state and can rely only on their own observations. To overcome this issue, we train end-to-end neural networks on demonstrations from the omniscient centralised controller. The networks take as input only local observations, i.e., an agent's sensor readings and the communications it receives, and produce as output the action to be performed and the communication to be transmitted. This study concentrates on two cooperative tasks solved with a distributed controller: distributing the robots evenly in space, and colouring them based on their position relative to the others. The second task cannot be solved without an explicit exchange of messages between the agents; in the first, a communication protocol is not required, although it may improve performance. The experiments are run in Enki, a high-performance open-source simulator for planar robots, which provides collision detection and limited physics support for robots moving on a flat surface and can simulate groups of robots hundreds of times faster than real time. The results show that applying a communication strategy improves the performance of the distributed model, letting it decide which actions to take almost as precisely and quickly as the expert controller.
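The idea of imitating an omniscient expert from local observations only can be illustrated with a minimal sketch for the first task (even spacing), reduced to robots on a line. Everything in it is an assumption made for illustration and not the setup used in this study: the one-dimensional world, the constants, the proportional expert, and the single-weight linear model that stands in for the neural network. Communication is left out, and the code does not use Enki.

```python
import random

L = 1.0      # length of the line (hypothetical units)
N = 5        # number of robots, including the two fixed end robots
GAIN = 1.0   # proportional gain of the expert (assumed)


def random_row(rng):
    """End robots fixed at 0 and L, interior robots at sorted random positions."""
    interior = sorted(rng.uniform(0.0, L) for _ in range(N - 2))
    return [0.0] + interior + [L]


def expert_action(xs, i):
    """Omniscient expert: uses robot i's index and absolute position."""
    goal = i * L / (N - 1)
    return GAIN * (goal - xs[i])


def local_feature(xs, i):
    """Local observation of robot i: gap ahead minus gap behind."""
    return (xs[i + 1] - xs[i]) - (xs[i] - xs[i - 1])


def collect_demonstrations(rng, episodes=500):
    """Pair each local observation with the action the expert would take."""
    data = []
    for _ in range(episodes):
        xs = random_row(rng)
        for i in range(1, N - 1):
            data.append((local_feature(xs, i), expert_action(xs, i)))
    return data


def fit_weight(data):
    """One-parameter least squares: minimise sum((w * f - a) ** 2)."""
    num = sum(f * a for f, a in data)
    den = sum(f * f for f, _ in data)
    return num / den


def spacing_error(xs):
    """Largest deviation of any gap from the ideal, evenly spaced gap."""
    ideal = L / (N - 1)
    return max(abs((xs[i + 1] - xs[i]) - ideal) for i in range(N - 1))


def run_distributed(xs, w, steps=100, dt=0.5):
    """Every interior robot acts on its own local observation only."""
    xs = list(xs)
    for _ in range(steps):
        vel = [0.0] + [w * local_feature(xs, i) for i in range(1, N - 1)] + [0.0]
        xs = [x + dt * v for x, v in zip(xs, vel)]
    return xs


rng = random.Random(0)
w = fit_weight(collect_demonstrations(rng))
start = random_row(rng)
end = run_distributed(start, w)
print(f"learned weight: {w:.3f}")
print(f"spacing error: {spacing_error(start):.3f} -> {spacing_error(end):.3f}")
```

The learner never sees the goal positions or its own index. It only regresses the expert's action onto what a robot could sense itself, which is the same division between training-time and run-time information described above.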