Training AI that is both strong and strategically diverse in multi-agent environments remains an important research topic in Deep Reinforcement Learning (DRL). An AI's strength is closely related to the diversity of its strategies, and this relationship can guide the training of AI that is both strong and strategically rich. To support this claim, we propose Diversity is Strength (DIS), a novel DRL training framework that trains multiple kinds of AI simultaneously. These AIs are linked through an interconnected history model pool structure, which improves both their capability and their strategy diversity. We also design a model evaluation and screening scheme that selects the best models to enrich the model pool and to produce the final AI. The proposed training method yields diverse, generalizable, and strong AI strategies without using human data. We tested the method in an AI competition based on Google Research Football (GRF) and won both the 5v5 and 11v11 tracks. For the first time, the method enables a GRF AI to reach a high level of play on both the 5v5 and 11v11 tracks, each of which is a complex multi-agent environment. Behavior analysis shows that the trained AI uses rich strategies, and ablation experiments show that the designed modules benefit the training process.
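The abstract names two mechanisms without giving their details: an interconnected history model pool and a model evaluation and screening scheme. The sketch below is one possible reading of how such pieces could fit together, not the DIS implementation. Everything in it is an assumption made for illustration: the function names (`linked_opponents`, `evaluate`, `screen`), the representation of a pool as a list of snapshot names, the mean-win-rate criterion, the 0.5 threshold, and the caller-supplied `play_match` function.

```python
def linked_opponents(kind, pools, links):
    """Opponents for one kind of AI: its own pool plus every linked pool.

    `pools` maps a kind name to a list of stored snapshot names.
    `links` maps a kind name to the kinds it is connected to (hypothetical).
    """
    opponents = list(pools[kind])
    for other in links.get(kind, ()):
        opponents.extend(pools[other])
    return opponents


def evaluate(candidate, opponents, play_match, games_per_opponent=4):
    """Mean win rate of `candidate` over all opponents.

    `play_match(a, b)` is supplied by the caller and returns True when `a` wins.
    """
    rates = []
    for opponent in opponents:
        wins = sum(play_match(candidate, opponent) for _ in range(games_per_opponent))
        rates.append(wins / games_per_opponent)
    return sum(rates) / len(rates)


def screen(candidate, kind, pools, links, play_match, threshold=0.5):
    """Add `candidate` to its kind's pool only if it clears the win-rate threshold.

    The threshold rule is an assumed screening criterion, not one from the paper.
    """
    opponents = linked_opponents(kind, pools, links)
    if evaluate(candidate, opponents, play_match) >= threshold:
        pools[kind].append(candidate)
        return True
    return False
```

In this reading, linking the pools means a snapshot of one kind is evaluated against stored snapshots of the other kinds as well as its own history, so a model that only beats its own lineage does not pass screening.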