A successful tactic adopted by the scientific community for advancing AI is to treat games as problems, an approach that has led to various breakthroughs. We adopt this strategy to study Rocket League, a widely popular but rather under-explored 3D multiplayer video game whose distinct physics engine and complex dynamics pose a significant challenge for developing efficient, high-performance game-playing agents. In this paper, we present Lucy-SKG, a Reinforcement Learning-based model that learned to play Rocket League in a sample-efficient manner and outperforms by a notable margin the two highest-ranking bots in this game, namely Necto (2022 bot champion) and its successor Nexto, thus becoming a state-of-the-art agent. Our contributions include: a) the development of a reward analysis and visualization library, b) novel parameterizable reward shape functions that capture the utility of complex reward types via our proposed Kinesthetic Reward Combination (KRC) technique, and c) the design of auxiliary neural architectures for training on reward prediction and state representation tasks in an on-policy fashion, improving both learning speed and performance. Through thorough ablation studies of each component of Lucy-SKG, we show that each contributes independently to overall performance. In doing so, we demonstrate the prospects and challenges of using sample-efficient Reinforcement Learning techniques for controlling complex dynamical systems under competitive team-based multiplayer conditions.