This work addresses the challenge of enabling a team of quadrupedal robots to collaboratively tow a cable-connected load through cluttered, unstructured environments while avoiding obstacles. Cables let the multi-robot system pass through narrow spaces by going slack when necessary. However, the alternation between taut and slack cable states introduces hybrid physical interactions, and the computational complexity of handling them grows exponentially with the number of agents. To tackle these challenges, we developed a scalable, decentralized system that dynamically coordinates a variable number of quadrupedal robots while managing the hybrid physical interactions inherent in the load-towing task. At the core of this system is a novel multi-agent reinforcement learning (MARL)-based planner designed for decentralized coordination. The planner is trained with a centralized training with decentralized execution (CTDE) framework, so that each robot makes decisions autonomously using only local (ego) observations. To accelerate learning and ensure effective collaboration across varying team sizes, we introduce a tailored training curriculum for MARL. Experimental results highlight the flexibility and scalability of the framework, with successful deployment on one to four robots in real-world scenarios and on up to twelve robots in simulation. The decentralized planner maintains consistent inference times regardless of team size. Additionally, the proposed system is robust to environmental perturbations and adapts to varying load weights. This work represents a step toward flexible and efficient collaboration among multiple legged robots in complex, real-world environments.
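The decentralized-execution half of CTDE described above can be illustrated with a minimal sketch. This is not the authors' planner: the linear `tanh` policy is a placeholder for a trained network, the centralized training stage is omitted entirely, and `OBS_DIM`, `ACT_DIM`, `make_shared_policy`, `act`, and `step_team` are hypothetical names and sizes chosen only for illustration. The point it shows is that every robot evaluates the same policy on its own local observation, so the per-robot inference cost does not depend on how many robots are in the team.

```python
import math
import random

OBS_DIM = 6  # hypothetical size of one robot's local (ego) observation
ACT_DIM = 2  # hypothetical planar velocity command (vx, vy)


def make_shared_policy(seed=0):
    """Build weights for a tiny linear policy shared by all robots.

    Placeholder for a trained network; the weights here are random.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
            for _ in range(ACT_DIM)]


def act(weights, local_obs):
    """Map ONE robot's local observation to its action.

    The function never sees other robots' observations, so its cost
    is fixed by OBS_DIM and ACT_DIM, not by the team size.
    """
    return [math.tanh(sum(w * o for w, o in zip(row, local_obs)))
            for row in weights]


def step_team(weights, team_obs):
    """Run the shared policy independently for each robot in the team."""
    return [act(weights, obs) for obs in team_obs]


if __name__ == "__main__":
    policy = make_shared_policy()
    rng = random.Random(1)
    for n_robots in (1, 4, 12):  # team sizes mentioned in the abstract
        team_obs = [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
                    for _ in range(n_robots)]
        actions = step_team(policy, team_obs)
        print(n_robots, "robots ->", len(actions), "actions")
```

Because `act` takes a single observation, adding or removing robots changes only how many times it is called (once on each robot's own computer in a real deployment), not the work done per call.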