In this study, we investigate the performance of Deep Q-Networks (DQNs) built on Convolutional Neural Network (CNN) and Transformer architectures across three Atari games. The advent of DQNs has significantly advanced reinforcement learning, enabling agents to learn optimal policies directly from high-dimensional sensory inputs such as pixel or RAM data. While CNN-based DQNs (DCQNs) have been extensively studied and deployed across domains, Transformer-based DQNs (DTQNs) remain relatively unexplored. Our research aims to fill this gap by benchmarking DCQNs and DTQNs on the Atari games Asteroids, Space Invaders, and Centipede. We find that, in the 35-40 million parameter range, the DCQN is faster than the DTQN under both the ViT and projection architectures, and that the DCQN outperforms the DTQN in all games except Centipede.