Recurrent off-policy deep reinforcement learning models achieve state-of-the-art performance but are often sidelined due to their high computational demands. In response, we introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that can leverage recurrent networks in any image-based off-policy RL setting without significant computational overhead by using both learnable and non-learnable encoder layers. When integrating RISE into leading non-recurrent off-policy RL algorithms, we observe a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark. We analyze various implementation strategies to highlight the versatility and potential of our proposed framework.
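A minimal sketch of the idea stated in the abstract, combining a learnable encoder with a non-learnable (frozen) encoder whose concatenated features feed a recurrent layer. The module name, layer sizes, and the choice of a GRU are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged PyTorch sketch: learnable + frozen encoders feeding a recurrent layer.
# All names and dimensions here are assumptions for illustration only.
import torch
import torch.nn as nn


class RecurrentImageEncoder(nn.Module):
    def __init__(self, in_channels: int = 4, latent_dim: int = 256):
        super().__init__()
        # Learnable convolutional encoder (updated by the RL loss).
        self.learnable = nn.Sequential(
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # Non-learnable encoder: randomly initialized and frozen, so it adds
        # no gradient computation or parameter updates.
        self.frozen = nn.Sequential(
            nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        for p in self.frozen.parameters():
            p.requires_grad = False
        # Recurrent layer operating on the concatenated encodings.
        feat_dim = self._feature_dim(in_channels)
        self.rnn = nn.GRU(feat_dim, latent_dim, batch_first=True)

    def _feature_dim(self, in_channels: int) -> int:
        # Probe with a dummy 84x84 frame (standard Atari preprocessing size).
        with torch.no_grad():
            dummy = torch.zeros(1, in_channels, 84, 84)
            return self.learnable(dummy).shape[1] + self.frozen(dummy).shape[1]

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, channels, height, width)
        b, t = obs_seq.shape[:2]
        flat = obs_seq.reshape(b * t, *obs_seq.shape[2:])
        feats = torch.cat([self.learnable(flat), self.frozen(flat)], dim=-1)
        latent, hidden = self.rnn(feats.reshape(b, t, -1), hidden)
        return latent, hidden  # latent would feed the off-policy actor/critic heads
```

Under these assumptions, only the learnable encoder and the recurrent layer contribute gradients, which is how such a design could keep the added overhead modest relative to a fully learned recurrent stack.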