Reinforcement learning (RL) has achieved notable performance in high-dimensional sequential decision-making tasks, yet it remains limited by low sample efficiency, sensitivity to noise, and weak generalization under partial observability. Most existing approaches address these issues primarily through optimization strategies, while the role of architectural priors in shaping representation learning and decision dynamics remains underexplored. Inspired by the structural principles of the cerebellum, we propose a biologically grounded RL architecture that incorporates large expansion, sparse connectivity, sparse activation, and dendritic-level modulation. Experiments on noisy, high-dimensional RL benchmarks show that both the cerebellar architecture and dendritic modulation consistently improve sample efficiency, robustness, and generalization over conventional designs. A sensitivity analysis of the architectural parameters suggests that cerebellum-inspired structures can deliver strong performance under constrained parameter budgets. Overall, our work underscores the value of cerebellar structural priors as effective inductive biases for RL.
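To make the architectural ingredients concrete, the sketch below illustrates one plausible reading of the cerebellar motifs named in the abstract: a large expansion layer with sparse random fan-in (granule-cell-like connectivity), top-k winner-take-all sparse activation, and a multiplicative per-unit gate standing in for dendritic modulation. This is an illustrative assumption, not the paper's actual implementation; all names, the expansion ratio, fan-in, and sparsity fraction are hypothetical choices.

```python
import numpy as np

def make_cerebellar_layer(in_dim, expansion=10, fan_in=4, seed=0):
    """Hypothetical cerebellum-like expansion layer: each of the
    in_dim * expansion 'granule' units receives input from only
    fan_in randomly chosen input dimensions (sparse connectivity)."""
    rng = np.random.default_rng(seed)
    hidden = in_dim * expansion                 # large expansion
    W = np.zeros((hidden, in_dim))
    for i in range(hidden):
        idx = rng.choice(in_dim, size=fan_in, replace=False)
        W[i, idx] = rng.normal(size=fan_in)     # sparse random weights
    return W

def forward(x, W, gate, k_frac=0.1):
    """Sparse activation via top-k winner-take-all, followed by a
    multiplicative per-unit gate as a stand-in for dendritic modulation."""
    h = W @ x
    k = max(1, int(k_frac * h.size))
    thresh = np.partition(h, -k)[-k]            # k-th largest value
    h = np.where(h >= thresh, h, 0.0)           # only top-k units fire
    return h * gate                             # dendritic-style gating

in_dim = 8
W = make_cerebellar_layer(in_dim)               # 80 hidden units
gate = np.ones(W.shape[0])                      # neutral gate for the demo
x = np.random.default_rng(1).normal(size=in_dim)
h = forward(x, W, gate)                         # sparse expanded code
```

The expansion-plus-sparsification pattern yields a high-dimensional but mostly-zero code, which is the usual rationale for such priors: decorrelated representations that are cheap to read out and robust to input noise.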