The Multi-Agent Path Finding (MAPF) problem entails finding collision-free paths for a set of agents, guiding them from their start to goal locations. However, MAPF does not account for several practical task-related constraints. For example, agents may need to perform actions at goal locations with specific execution times, adhering to predetermined orders and timeframes. Moreover, goal assignments may not be predefined for agents, and the optimization objective may lack an explicit definition. To incorporate task assignment, path planning, and a user-defined objective into a coherent framework, this paper examines the Task Assignment and Path Finding with Precedence and Temporal Constraints (TAPF-PTC) problem. We augment Conflict-Based Search (CBS) to simultaneously generate task assignments and collision-free paths that adhere to precedence and temporal constraints, maximizing an objective quantified by the return from a user-defined reward function in reinforcement learning (RL). Experimentally, we demonstrate that our algorithm, CBS-TA-PTC, can solve highly challenging bomb-defusing tasks with precedence and temporal constraints efficiently relative to multi-agent reinforcement learning (MARL) and adapted Target Assignment and Path Finding (TAPF) methods.