Mass casualty incidents (MCIs) are a growing concern, characterized by complexity and uncertainty that demand adaptive decision-making strategies. The victim tagging step in the emergency medical response must be completed quickly because it provides the information that guides subsequent time-constrained response actions. In this paper, we present a mathematical formulation of multi-agent victim tagging that minimizes the time it takes for responders to tag all victims. Five distributed heuristics are formulated and evaluated in simulation experiments. The heuristics considered are on-the-go, practical solutions that represent varying levels of situational uncertainty in the form of global or local communication capabilities, reflecting practical constraints. We further investigate how well a multi-agent reinforcement learning (MARL) strategy, factorized deep Q-network (FDQN), minimizes victim tagging time compared with the baseline heuristics. Extensive simulations demonstrate that, among the heuristics, methods with local communication are more efficient for adaptive victim tagging, specifically choosing the nearest victim with the option to replan. Across all experiments, we find that our FDQN approach outperforms the heuristics in smaller-scale scenarios, while the heuristics excel in more complex scenarios. Our experiments span diverse complexities that probe the upper limits of MARL capabilities for real-world applications and reveal key insights.
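To make the "nearest victim with the option to replan" idea concrete, the following is a minimal illustrative sketch, not the formulation or implementation used in the paper. It rests on several assumptions that the abstract does not state: responders move in a 2D plane at constant speed, tagging is instantaneous on arrival, and every responder sees a shared `tagged` set, which does not model the global versus local communication constraints studied in the paper. The names `nearest_untagged` and `simulate` are hypothetical.

```python
import math


def nearest_untagged(pos, victims, tagged):
    """Return the index of the closest untagged victim, or None if none remain."""
    candidates = [i for i in range(len(victims)) if i not in tagged]
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(pos, victims[i]))


def simulate(responders, victims, speed=1.0, dt=0.1, max_steps=100_000):
    """Greedy nearest-victim tagging with replanning at every time step.

    Assumptions (illustrative only): 2D positions, constant speed,
    instantaneous tagging on arrival, and a shared view of tagged victims.
    Returns the simulated time at which the last victim is tagged.
    """
    positions = [tuple(p) for p in responders]
    tagged = set()
    for step in range(max_steps):
        if len(tagged) == len(victims):
            return step * dt
        for r, pos in enumerate(positions):
            # Replanning: the target is recomputed every step, so a responder
            # switches victims as soon as its previous target has been tagged.
            target = nearest_untagged(pos, victims, tagged)
            if target is None:
                break
            tx, ty = victims[target]
            d = math.dist(pos, (tx, ty))
            if d <= speed * dt:
                # Close enough to arrive within this step: tag the victim.
                positions[r] = (tx, ty)
                tagged.add(target)
            else:
                # Move a distance of speed * dt straight toward the target.
                frac = speed * dt / d
                positions[r] = (pos[0] + (tx - pos[0]) * frac,
                                pos[1] + (ty - pos[1]) * frac)
    raise RuntimeError("simulation did not finish within max_steps")
```

For example, `simulate([(0.5, 0), (-0.5, 0)], [(1, 0), (-1, 0)])` lets each responder head for its own nearest victim, whereas a single responder starting at the origin has to visit both victims in sequence and therefore takes longer.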