With the long-term goal of reducing image processing time on an autonomous mobile robot, this paper explores the use of log-polar-like image data with gaze control. Gaze control is performed not on the Cartesian image but directly on the log-polar-like image data. We start from the classic deep reinforcement learning approach for Atari games: we extend an A3C deep RL approach with an LSTM network and learn a policy for playing three Atari games together with a policy for gaze control. Although the Atari games already use low-resolution images of 80 by 80 pixels, we are able to reduce the number of image pixels by a further factor of 5 without losing any gaming performance.
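To make the idea of log-polar-like sampling around a gaze point concrete, the following is a minimal sketch and not the method used in the paper. The function name `log_polar_sample`, the nearest-neighbour lookup, the zero fill outside the image, and the 32 by 40 grid are all assumptions made for illustration; the grid size is chosen only so that the pixel count (1280) is one fifth of an 80 by 80 frame (6400), matching the reduction factor stated above.

```python
import numpy as np


def log_polar_sample(image, gaze_xy, n_rings=32, n_angles=40, r_min=1.0, r_max=None):
    """Sample a 2-D image on a log-polar grid centred at the gaze point.

    Hypothetical helper: ring radii grow geometrically from r_min to r_max,
    so resolution is high near the gaze point and coarse in the periphery.
    """
    h, w = image.shape
    cx, cy = gaze_xy
    if r_max is None:
        # Assumption: let the outermost ring reach half the image diagonal.
        r_max = 0.5 * float(np.hypot(h, w))

    radii = np.geomspace(r_min, r_max, n_rings)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")

    # Nearest-neighbour Cartesian coordinates of every (ring, angle) sample.
    xs = np.rint(cx + rr * np.cos(aa)).astype(int)
    ys = np.rint(cy + rr * np.sin(aa)).astype(int)

    # Samples that fall outside the frame are left at zero.
    inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    out = np.zeros((n_rings, n_angles), dtype=image.dtype)
    out[inside] = image[ys[inside], xs[inside]]
    return out


# Example: an 80 x 80 frame sampled with the gaze at the image centre.
frame = np.arange(6400.0).reshape(80, 80)
foveated = log_polar_sample(frame, gaze_xy=(40, 40))
# foveated has 32 * 40 = 1280 values, i.e. 6400 / 5.
```

In a setup like the one described, the gaze point would be chosen by the learned gaze-control policy at each step, and the sampled array would replace the full Cartesian frame as the network input.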