Scripted agents have predominantly won the five previous iterations of the IEEE microRTS ($\mu$RTS) competitions hosted at CIG and CoG. Although Deep Reinforcement Learning (DRL) algorithms have made significant strides in real-time strategy (RTS) games, their adoption in this primarily academic competition has been limited by the considerable training resources required and the complexity of creating and debugging such agents. RAISocketAI is the first DRL agent to win the IEEE microRTS competition. In a benchmark without performance constraints, RAISocketAI regularly defeated the two prior competition winners. This first competition-winning DRL submission can serve as a benchmark for future microRTS competitions and a starting point for future DRL research. Iteratively fine-tuning the base policy and transfer learning to specific maps were critical to RAISocketAI's winning performance. These strategies can be used to economically train future DRL agents. Further work on Imitation Learning via Behavior Cloning, followed by fine-tuning those models with DRL, has shown promise as an efficient way to bootstrap models with demonstrated, competitive behaviors.