Critical learning periods are windows early in development during which temporary sensory deficits can have a permanent effect on behavior and learned representations. Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems. This suggests that critical periods may be fundamental to learning rather than an accident of biology. Yet why exactly critical periods emerge in deep networks remains an open question; in particular, it is unclear whether the critical periods observed in both systems depend on particular architectural or optimization details. To isolate the key underlying factors, we focus on deep linear network models and show that, surprisingly, such networks also display much of the behavior seen in biological and artificial networks, while being amenable to analytical treatment. We show that critical periods depend on the depth of the model and the structure of the data distribution. We also show, analytically and in simulations, that the learning of features is tied to competition between sources. Finally, we extend our analysis to multi-task learning and show that pre-training on certain tasks can damage transfer performance on new tasks, and that this effect depends on the relationship between the tasks and the duration of the pre-training stage. To the best of our knowledge, our work provides the first analytically tractable model that sheds light on why critical learning periods emerge in biological and artificial networks.
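The competition between sources mentioned in the abstract can be illustrated with a deliberately tiny toy. This is a sketch under our own assumptions and is not the authors' model or experimental setup. Two scalar pathways, each a product of `depth` tied weights, jointly predict one target from two perfectly correlated inputs, and the second input is zeroed out for the first `deficit_steps` updates to mimic a temporary sensory deficit. The function name `train_two_pathways` and all hyperparameters (`lr`, `init`, step counts) are hypothetical choices made only for illustration.

```python
def train_two_pathways(depth=3, deficit_steps=0, total_steps=2000,
                       lr=0.05, init=0.1):
    """Toy two-pathway deep linear model (illustrative assumption).

    Runs gradient descent on 0.5 * (y - a**depth * x1 - b**depth * x2)**2
    and returns the final pathway contributions (a**depth, b**depth).
    """
    a, b = init, init          # one tied weight per pathway, small init
    y, x1 = 1.0, 1.0           # target and the always-available source
    for step in range(total_steps):
        # The second source is silenced while the deficit lasts.
        x2 = 0.0 if step < deficit_steps else 1.0
        residual = y - (a**depth * x1 + b**depth * x2)
        # Descent directions (negative gradients) for each pathway.
        step_a = residual * depth * a**(depth - 1) * x1
        step_b = residual * depth * b**(depth - 1) * x2
        a += lr * step_a
        b += lr * step_b
    return a**depth, b**depth


# Compare no deficit, a brief deficit, and a long deficit.
for deficit in (0, 5, 500):
    contrib_a, contrib_b = train_two_pathways(deficit_steps=deficit)
    print(f"deficit={deficit:3d}  pathway A={contrib_a:.3f}  pathway B={contrib_b:.3f}")
```

In this toy, the pathway that gets a head start keeps growing faster, so the longer the deficit lasts, the smaller the share of the target the deprived pathway ends up explaining, even though its input is fully restored afterwards. Because the two inputs are perfectly correlated here, the example only shows the winner-take-all flavor of the competition; it does not reproduce the paper's analysis of depth or data structure.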