The ability of a brain or a neural network to learn efficiently depends crucially on both the task structure and the learning rule. Previous works have analyzed the dynamical equations describing learning in the relatively simplified context of the perceptron, under the assumptions of a student-teacher framework or a linearized output. While these assumptions have facilitated theoretical understanding, they have precluded a detailed understanding of the roles of the nonlinearity and input-data distribution in determining the learning dynamics, limiting the applicability of the theories to real biological or artificial neural networks. Here, we use a stochastic-process approach to derive flow equations describing learning, applying this framework to the case of a nonlinear perceptron performing binary classification. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and of the input-data distribution on the perceptron's learning curve and on the forgetting curve as subsequent tasks are learned. In particular, we find that the input-data noise affects the learning speed differently under SL vs. RL and also determines how quickly learning of a task is overwritten by subsequent learning. Additionally, we verify our approach with real data using the MNIST dataset. This approach points a way toward analyzing learning dynamics for more-complex circuit architectures.
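The SL-vs-RL comparison described above can be illustrated with a minimal numerical sketch. The code below is not the paper's derivation or notation: it sets up a hypothetical teacher-student perceptron with a sigmoid output and compares a supervised (delta-rule) weight update against a reward-modulated (REINFORCE-style) update, measuring how well the student weights align with the teacher. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not the paper's model): labels come from a fixed
# teacher vector; the student is a nonlinear perceptron y = sigmoid(w @ x).
d = 50
w_teacher = rng.standard_normal(d)
w_teacher /= np.linalg.norm(w_teacher)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(rule, steps=2000, lr=0.05):
    """Train a student perceptron with either 'SL' or 'RL' updates."""
    w = np.zeros(d)
    for _ in range(steps):
        x = rng.standard_normal(d)
        label = 1.0 if w_teacher @ x > 0 else 0.0
        p = sigmoid(w @ x)
        if rule == "SL":
            # Supervised: delta rule (gradient of cross-entropy loss).
            w += lr * (label - p) * x
        else:
            # RL: sample a binary action, receive reward +1/-1,
            # apply a REINFORCE-style score-function update.
            a = 1.0 if rng.random() < p else 0.0
            r = 1.0 if a == label else -1.0
            w += lr * r * (a - p) * x
    # Cosine overlap of student with teacher (1.0 = perfect alignment).
    return (w @ w_teacher) / (np.linalg.norm(w) + 1e-12)

overlap_sl = train("SL")
overlap_rl = train("RL")
print(f"teacher overlap  SL: {overlap_sl:.2f}  RL: {overlap_rl:.2f}")
```

Under this toy setup, both rules align the student with the teacher, with the reward-modulated update typically converging more slowly because its gradient estimate is noisier; the paper's flow equations characterize such differences analytically, including their dependence on input-data noise.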