We introduce a model for multi-agent interaction problems to understand how a heterogeneous team of agents should organize its resources to tackle a heterogeneous team of attackers. The model is inspired by how the human immune system tackles a diverse set of pathogens. Its key property is "cross-reactivity", which enables a particular defender type to respond strongly to some attackers but weakly to a few different types of attackers. Because of cross-reactivity, the optimal defender distribution, i.e., the one that minimizes the harm caused by the attackers, is supported on a discrete set. The defender team can therefore allocate resources to only a few types and still tackle a large number of attacker types. We study this model in several settings to characterize a set of guiding principles for control problems with heterogeneous teams of agents. These settings include the sensitivity of the harm to sub-optimal defender distributions, teams consisting of a small number of attackers and defenders, estimating and tackling an evolving attacker distribution, and competition between defenders that gives near-optimal behavior using decentralized computation of the control. We also compare this model with reinforcement-learned policies for the defender team.