Exploration in deep reinforcement learning (RL) is commonly implemented as temporally uncorrelated white noise added to the agent's actions. However, recent work shows that temporally correlated colored noise can improve exploration efficiency by producing smoother trajectories with better coverage of the state space. We ask whether action noise inspired by infant spontaneous movements can also improve exploration in deep RL. We find that the power spectral densities of infants' end-effector velocities follow a colored noise process whose spectral exponent increases with age. Inspired by this developmental pattern, we introduce a mechanism that progressively increases the temporal autocorrelation of the exploration noise during RL training to match the infant statistics. Experiments across several RL environments show that infant-inspired noise produces structured exploratory behavior and can improve learning efficiency compared to conventional exploration strategies. These findings suggest that human motor and cognitive development can provide useful guidance for designing learning mechanisms in artificial agents. Our code is available at https://github.com/trieschlab/baby-noise-rl.
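To make the idea of progressively more correlated exploration noise concrete, the following is a minimal sketch and not the implementation from the repository above. It assumes a standard FFT-based generator for noise with power spectral density proportional to 1/f^β, and a hypothetical linear schedule that raises β from 0 (white noise) to 2 over training. The function names, the linear schedule, and the endpoint values are illustrative assumptions; the infant-derived exponents are not stated in this abstract.

```python
import numpy as np


def colored_noise(beta, n_steps, rng):
    """Sample a unit-variance sequence with power spectral density ~ 1/f^beta."""
    freqs = np.fft.rfftfreq(n_steps)  # non-negative frequencies, 0 to 0.5
    scale = np.zeros_like(freqs)
    # Amplitude ~ f^(-beta/2) gives power ~ f^(-beta); the DC term is left at zero.
    scale[1:] = freqs[1:] ** (-beta / 2.0)
    phases = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
    signal = np.fft.irfft(scale * phases, n=n_steps)
    return signal / signal.std()


def beta_schedule(progress, beta_start=0.0, beta_end=2.0):
    """Hypothetical linear schedule: the spectral exponent grows with training progress.

    progress is the fraction of training completed, clipped to [0, 1].
    The endpoints are placeholders, not values fitted to infant data.
    """
    progress = min(max(progress, 0.0), 1.0)
    return beta_start + progress * (beta_end - beta_start)


def lag1_autocorr(x):
    """Lag-1 autocorrelation, a simple measure of temporal correlation."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))
```

In use, one would draw a fresh noise sequence per episode with `colored_noise(beta_schedule(step / total_steps), episode_length, rng)` and add it, suitably scaled, to the policy's actions. Comparing `lag1_autocorr` early and late in training shows the noise shifting from nearly uncorrelated to strongly correlated.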