Game development is a long process that involves many stages before a product is ready for the market. Human playtesting is among the most time-consuming of these stages, as testers must repeatedly perform tasks in search of errors in the code. Automated testing is therefore seen as a key technology for the gaming industry, as it would dramatically reduce development costs and improve efficiency. Toward this end, we propose EVOLUTE, a novel imitation learning-based architecture that combines behavioural cloning (BC) with energy-based models (EBMs). EVOLUTE is a two-stream ensemble model that splits the action space of autonomous agents into continuous and discrete tasks. The EBM stream handles the continuous tasks, giving more refined and adaptive control, while the BC stream handles the discrete actions, which eases training. We evaluate EVOLUTE in a shooting-and-driving game, where the agent is required to navigate and continuously identify targets to attack. The proposed model generalises better than standard BC approaches, showing a wider range of behaviours and higher performance. EVOLUTE is also easier to train than a pure end-to-end EBM, because discrete tasks can be quite sparse in the dataset, which forces an end-to-end EBM to explore a much wider set of possible actions during training.
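To make the two-stream split concrete, the following is a minimal sketch under stated assumptions, not the EVOLUTE implementation. All names, dimensions, and weights (`W_bc`, `W_ebm`, `energy`, `act`, the action ranges) are hypothetical, and the networks are replaced by untrained toy linear maps. The continuous stream is shown as an implicit policy that samples candidate actions and keeps the one with the lowest energy, which is one common way to run inference with an EBM; the abstract does not state which inference procedure is used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: observation features, discrete actions, continuous controls.
OBS_DIM, N_DISCRETE, CONT_DIM = 8, 3, 2

# Toy, untrained parameters standing in for learned networks (assumption).
W_bc = rng.normal(size=(OBS_DIM, N_DISCRETE))
W_ebm = rng.normal(size=(OBS_DIM, CONT_DIM))


def bc_discrete_action(obs):
    """BC stream: explicit policy that returns the highest-scoring discrete action."""
    logits = obs @ W_bc
    return int(np.argmax(logits))


def energy(obs, actions):
    """EBM stream: one scalar energy per candidate action (lower is better).

    Toy quadratic energy whose minimum sits at tanh(obs @ W_ebm).
    """
    target = np.tanh(obs @ W_ebm)
    return np.sum((actions - target) ** 2, axis=-1)


def ebm_continuous_action(obs, n_samples=256):
    """Implicit policy: sample candidates in [-1, 1] and keep the lowest-energy one."""
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, CONT_DIM))
    return candidates[np.argmin(energy(obs, candidates))]


def act(obs):
    """Combine both streams into a single agent action."""
    return {
        "discrete": bc_discrete_action(obs),
        "continuous": ebm_continuous_action(obs),
    }


obs = rng.normal(size=OBS_DIM)
action = act(obs)
```

In this sketch, the discrete head could map to events such as firing, and the continuous head to steering or aiming values; that mapping is illustrative only.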