Exploration in deep reinforcement learning (RL) is commonly implemented as temporally uncorrelated white noise. However, recent work shows that temporally correlated colored noise can improve exploration efficiency by producing smooth trajectories with better coverage of the state space. We ask whether action noise inspired by infant spontaneous movements can also improve exploration in deep RL. We find that the power spectral densities of infants' end-effector velocities follow a colored noise process whose spectral exponent increases with age. Inspired by this developmental pattern, we introduce a mechanism that progressively increases the temporal auto-correlation of exploration noise during RL training, matching the infant statistics. Experiments across several RL environments show that infant-inspired noise produces structured exploratory behavior and can improve learning efficiency compared to conventional exploration strategies. These findings suggest that human motor and cognitive development can provide useful guidance for designing learning mechanisms in artificial agents. Our code is available at https://github.com/trieschlab/baby-noise-rl.
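As a rough illustration of the mechanism described in the abstract, the sketch below samples noise whose power spectral density falls off as 1/f^β and raises β as training progresses, so that later noise is more strongly auto-correlated. This is not the authors' implementation: the function names `colored_noise` and `beta_schedule`, the linear schedule, the exponent range from 0 to 2, and the FFT-based spectral shaping are all assumptions chosen for illustration, and the infant-derived exponents from the paper are not reproduced here.

```python
import numpy as np


def colored_noise(beta, n_steps, rng):
    """Sample a unit-variance sequence whose power spectrum scales as 1/f**beta.

    Hypothetical helper: shapes white Gaussian noise in the frequency domain.
    """
    freqs = np.fft.rfftfreq(n_steps)        # non-negative frequencies, freqs[0] == 0
    scale = np.zeros_like(freqs)
    scale[1:] = freqs[1:] ** (-beta / 2.0)  # amplitude ~ f^(-beta/2); DC term left at zero
    spectrum = scale * (
        rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
    )
    signal = np.fft.irfft(spectrum, n=n_steps)
    return signal / signal.std()


def beta_schedule(progress, beta_start=0.0, beta_end=2.0):
    """Linearly interpolate the spectral exponent (assumed schedule and range).

    progress is the fraction of training completed and is clipped to [0, 1].
    """
    progress = min(max(progress, 0.0), 1.0)
    return beta_start + progress * (beta_end - beta_start)


rng = np.random.default_rng(0)
for progress in (0.0, 0.5, 1.0):
    beta = beta_schedule(progress)
    noise = colored_noise(beta, n_steps=1000, rng=rng)
    # Lag-1 autocorrelation is a simple proxy for temporal correlation.
    lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]
    print(f"progress={progress:.1f} beta={beta:.1f} lag-1 autocorrelation={lag1:.2f}")
```

In an RL setting, one such sequence per action dimension could be sampled at the start of each episode and added to the policy's actions step by step. With β = 0 the sequence is white noise, and larger β gives smoother, more persistent perturbations.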