The ability of a brain or a neural network to learn efficiently depends crucially on both the task structure and the learning rule. Previous works have analyzed the dynamical equations describing learning in the relatively simplified context of the perceptron under assumptions of a student-teacher framework or a linearized output. While these assumptions have facilitated theoretical understanding, they have precluded a detailed understanding of the roles of the nonlinearity and input-data distribution in determining the learning dynamics, limiting the applicability of the theories to real biological or artificial neural networks. Here, we use a stochastic-process approach to derive flow equations describing learning, applying this framework to the case of a nonlinear perceptron performing binary classification. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and of the input-data distribution on the perceptron's learning curve and on the forgetting curve as subsequent tasks are learned. In particular, we find that input-data noise affects the learning speed differently under SL vs. RL and determines how quickly learning of a task is overwritten by subsequent learning. Additionally, we verify our approach on real data using the MNIST dataset. This approach points a way toward analyzing learning dynamics for more-complex circuit architectures.
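To make the setting concrete, below is a minimal simulation sketch, not the paper's derivation: a nonlinear perceptron y = tanh(w·x) classifying noisy two-class Gaussian inputs, trained either by a supervised gradient rule (SL) or by a reward-modulated node-perturbation rule standing in for RL. The class-mean direction `mu`, the noise level `sigma`, the specific update rules, and all hyperparameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_steps = 50, 3000           # input dimension, training steps (assumed)
eta, sigma, s = 0.1, 0.5, 0.5   # learning rate, input-data noise, RL perturbation std (assumed)
mu = rng.standard_normal(d) / np.sqrt(d)          # class-mean direction (assumed)
phi, dphi = np.tanh, lambda h: 1.0 - np.tanh(h) ** 2

def sample():
    """Noisy binary-classification input: x = t*mu + noise, label t = +/-1."""
    t = rng.choice([-1.0, 1.0])
    x = t * mu + sigma * rng.standard_normal(d) / np.sqrt(d)
    return x, t

def train(rule):
    w, err, rbar = np.zeros(d), [], 0.0
    for _ in range(n_steps):
        x, t = sample()
        h = w @ x
        if rule == "SL":                           # gradient of squared error
            w += eta * (t - phi(h)) * dphi(h) * x
        else:                                      # RL: node perturbation (REINFORCE-like)
            xi = s * rng.standard_normal()         # perturb the pre-activation
            r = -(t - phi(h + xi)) ** 2            # reward = negative squared error
            w += eta * (r - rbar) * (xi / s**2) * x  # correlate perturbation with reward
            rbar += 0.1 * (r - rbar)               # running reward baseline
        err.append(float(np.sign(phi(h)) != t))
    # smoothed classification-error trace, i.e. an empirical learning curve
    return np.convolve(err, np.ones(100) / 100, mode="valid")

sl_curve, rl_curve = train("SL"), train("RL")
print("final error  SL: %.3f   RL: %.3f" % (sl_curve[-1], rl_curve[-1]))
```

In runs of this kind, the RL curve is typically noisier and slower to fall than the SL curve; this is the sort of learning-rule and noise dependence the flow-equation analysis is meant to capture, and the sketch serves only as a qualitative illustration of the setup.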