Do you remember your first video game console? We remember ours. Decades ago, they provided hours of entertainment. Now, we have repurposed them to solve dynamic and stochastic optimization problems. With deep reinforcement learning methods posting superhuman performance on a wide range of Atari games, we consider the task of representing a classic logistics problem as a game. Then, we train agents to play it. We consider several game designs for the vehicle routing problem with stochastic requests. We show how various design features impact agents' performance, including perspective, field of view, and minimaps. With the right game design, general-purpose Atari agents outperform optimization-based benchmarks, especially as problem size grows. Our work points to the representation of dynamic and stochastic optimization problems via games as a promising research direction.