Tomorrow's robots will need to distinguish useful information from noise when performing different tasks. A household robot, for instance, may continuously receive a plethora of information about the home, but needs to focus on just a small subset to successfully execute its current chore. Filtering out distracting inputs that contain irrelevant data has received little attention in the reinforcement learning literature. To start resolving this, we formulate a problem setting in reinforcement learning called the $\textit{extremely noisy environment}$ (ENE), where up to $99\%$ of the input features are pure noise. Agents need to detect which features provide task-relevant information about the state of the environment. To this end, we propose a new method termed $\textit{Automatic Noise Filtering}$ (ANF), which combines the principles of dynamic sparse training with various deep reinforcement learning algorithms. The sparse input layer learns to focus its connectivity on task-relevant features, such that ANF-SAC and ANF-TD3 outperform standard SAC and TD3 by a large margin, while using up to $95\%$ fewer weights. Furthermore, we devise a transfer learning setting for ENEs, in which all features of the environment are permuted after 1M timesteps to simulate the fact that other information sources can become relevant as the world evolves. Again, ANF surpasses the baselines in final performance and sample complexity. Our code is available at https://github.com/bramgrooten/automatic-noise-filtering
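As a rough illustration of the setting described above, the following Python sketch shows one way an ENE observation could be built and how a sparse input layer's connectivity could be updated. It is not the authors' implementation (see the linked repository for that). The function names, the standard Gaussian noise distribution, the random regrowth rule, and the zero initialization of new connections are all assumptions made for this example.

```python
import numpy as np


def make_noisy_observation(state, noise_fraction, rng):
    """Append pure-noise features so that `noise_fraction` of the result is noise.

    Assumption: noise is drawn from a standard Gaussian.
    """
    n_real = state.shape[0]
    n_total = int(round(n_real / (1.0 - noise_fraction)))
    noise = rng.standard_normal(n_total - n_real)
    return np.concatenate([state, noise])


def permute_features(obs, perm):
    """Reorder features with a fixed permutation (the transfer setting)."""
    return obs[perm]


def prune_and_regrow(weights, mask, drop_fraction, rng):
    """One hypothetical dynamic-sparse-training update of a layer.

    Drops the smallest-magnitude active weights and activates the same
    number of currently inactive connections at random positions.
    `mask` is a boolean array with the same shape as `weights`.
    """
    flat_w = weights.ravel().copy()
    flat_m = mask.ravel().copy()
    active = np.flatnonzero(flat_m)
    # Candidates for regrowth are chosen before dropping, so a weight that
    # was just pruned is not immediately regrown.
    inactive = np.flatnonzero(~flat_m)
    n_drop = min(int(drop_fraction * active.size), inactive.size)
    if n_drop == 0:
        return weights.copy(), mask.copy()
    order = np.argsort(np.abs(flat_w[active]))
    drop = active[order[:n_drop]]
    grow = rng.choice(inactive, size=n_drop, replace=False)
    flat_m[drop] = False
    flat_w[drop] = 0.0
    flat_m[grow] = True
    flat_w[grow] = 0.0  # assumption: new connections start at zero
    return flat_w.reshape(weights.shape), flat_m.reshape(mask.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = np.ones(10)  # stand-in for a 10-dimensional task state
    obs = make_noisy_observation(state, noise_fraction=0.9, rng=rng)
    perm = rng.permutation(obs.size)
    shuffled = permute_features(obs, perm)

    # Sparse input layer: roughly 20% of the connections are active.
    mask = rng.random((obs.size, 8)) < 0.2
    weights = rng.standard_normal((obs.size, 8)) * mask
    weights, mask = prune_and_regrow(weights, mask, drop_fraction=0.1, rng=rng)
    print(obs.shape, shuffled.shape, int(mask.sum()))
```

In a sketch like this, the number of active connections stays constant across updates. Only their positions change, which is what would let the input layer gradually shift connectivity toward informative features during training.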