Efficient multi-robot task allocation (MRTA) is fundamental to time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport (MRTA-CT), in which tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload constraints. For large instances of these problems, involving hundreds to thousands of tasks and tens to hundreds of robots, traditional non-learning solvers are often time-inefficient, and emerging learning-based policies do not scale well to larger problems without costly retraining. To address this gap, we use a recently proposed encoder-decoder graph neural network built on capsule networks and a multi-head attention mechanism, and add topological descriptors (TD) as new features to improve transferability to unseen problems of similar and larger size. Persistent homology is used to derive the TD, and proximal policy optimization is used to train the TD-augmented graph neural network. The resulting policy model compares favorably to state-of-the-art non-learning baselines while being much faster. The benefit of using TD is readily evident when scaling to test problems larger than those used in training.
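The abstract states that persistent homology is used to derive the topological descriptors, but it does not say which filtration or summary statistics are used. As a rough illustration only, the sketch below computes 0-dimensional persistence of a Vietoris-Rips filtration over 2D task locations, using the standard fact that the finite death times of connected components equal the edge lengths of a minimum spanning tree. The function names `h0_persistence` and `summarize`, the choice of a Rips filtration, the use of edge length (rather than half of it) as the scale, and the summary statistics are all assumptions made for this example, not the authors' method.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Finite death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Every point is born at scale 0. Processing edges in increasing length
    (Kruskal's algorithm), each edge that merges two components kills one
    class at that edge's length. Assumption: scale = edge length.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # n - 1 values for n points, in nondecreasing order


def summarize(deaths):
    """Hypothetical fixed-length descriptor built from the death times."""
    if not deaths:
        return {"count": 0, "mean": 0.0, "max": 0.0}
    return {
        "count": len(deaths),
        "mean": sum(deaths) / len(deaths),
        "max": max(deaths),
    }


if __name__ == "__main__":
    # Hypothetical task locations: a tight cluster plus one distant task.
    tasks = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
    print(summarize(h0_persistence(tasks)))
```

A large maximum death time relative to the mean indicates well-separated clusters of tasks, which is the kind of size-independent structural signal a topological feature could contribute. In practice a dedicated library would likely be used to obtain higher-dimensional features as well.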