Developing reliable mechanisms for continuous local learning is a central challenge faced by both biological and artificial systems. Yet how environmental factors and structural constraints on the learning network shape the optimal plasticity mechanisms remains obscure, even in simple settings. To elucidate these dependencies, we study meta-learning via evolutionary optimization of simple reward-modulated plasticity rules in embodied agents solving a foraging task. We show that unconstrained meta-learning leads to the emergence of diverse plasticity rules. However, regularization and bottlenecks in the model reduce this variability, resulting in interpretable rules. Our findings indicate that the meta-learning of plasticity rules is highly sensitive to various parameters, and this sensitivity is possibly reflected in the learning rules found in biological networks. When included in models, these dependencies can be used to discover potential objective functions and details of biological learning through comparison with experimental observations.
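To make the setup concrete, the following is a minimal sketch of evolutionary meta-learning of a reward-modulated plasticity rule. It is an illustration under stated assumptions, not the implementation used in this work: the four-parameter rule form, the linear value-prediction agent, the fitness measure, and all names (`lifetime_fitness`, `mean_fitness`, `evolve`, `theta`) are hypothetical choices made for the example.

```python
import random


def lifetime_fitness(theta, rng, n_inputs=4, steps=200, eta=0.1, w_max=5.0):
    """Simulate one agent lifetime; return the negative mean squared prediction error.

    Assumed (hypothetical) plasticity rule, applied after every food item:
        dw_i = eta * (a * R * x_i + b * y * x_i + c * x_i + d)
    where x is the food's ingredient vector, y the agent's predicted value,
    and R the reward actually received.
    """
    a, b, c, d = theta
    # Hidden value of each ingredient, redrawn for every lifetime.
    values = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
    w = [0.0] * n_inputs  # the agent starts each lifetime with no knowledge
    total_error = 0.0
    for _ in range(steps):
        x = [rng.random() for _ in range(n_inputs)]
        reward = sum(v * xi for v, xi in zip(values, x))
        y = sum(wi * xi for wi, xi in zip(w, x))
        total_error += (reward - y) * (reward - y)
        # Reward-modulated update; weights are clipped to keep values bounded.
        w = [
            max(-w_max, min(w_max, wi + eta * (a * reward * xi + b * y * xi + c * xi + d)))
            for wi, xi in zip(w, x)
        ]
    return -total_error / steps


def mean_fitness(theta, rng, n_lifetimes=3):
    """Average the fitness of one rule over several random environments."""
    return sum(lifetime_fitness(theta, rng) for _ in range(n_lifetimes)) / n_lifetimes


def evolve(generations=30, pop_size=20, n_elite=5, sigma=0.1, seed=0):
    """Simple elitist evolution over the rule parameters theta = (a, b, c, d)."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        # Rank rules by (noisy) fitness and keep the best ones unchanged.
        ranked = sorted(population, key=lambda th: mean_fitness(th, rng), reverse=True)
        elite = ranked[:n_elite]
        # Refill the population with Gaussian-mutated copies of elite rules.
        children = [
            [p + rng.gauss(0.0, sigma) for p in rng.choice(elite)]
            for _ in range(pop_size - n_elite)
        ]
        population = elite + children
    return max(population, key=lambda th: mean_fitness(th, rng))


if __name__ == "__main__":
    best = evolve()
    print("evolved rule parameters (a, b, c, d):", best)
```

In this sketch, setting `theta = (1, -1, 0, 0)` recovers a delta rule, which gives one reference point against which evolved rules can be compared. Because fitness is evaluated on randomly drawn environments, repeated runs with different seeds may return different parameter vectors.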