The brain learns abstract representations of high-dimensional sensory input, but the plasticity rules that enable such learning are unknown. We study biologically plausible algorithms on the Random Hierarchy Model (RHM), an artificial dataset designed to investigate how deep neural networks learn the intrinsic hierarchical structure of high-dimensional data. We focus on two types of local learning rules that avoid both a long convergence time and the use of a symmetric error network. The first type uses direct feedback signals to approximate error propagation from the output layer. The second type uses layerwise self-supervised contrastive or non-contrastive loss functions that do not explicitly approximate errors at the output layer. We show that all rules of the first type fail to solve the tasks of the RHM and trace this failure back to input-specific nonlinearities ("masking") that are implemented in full backpropagation and are essential for learning complex tasks. However, algorithms of the second type are able to learn the hierarchical hidden structure of the RHM tasks and are as data-efficient as supervised backpropagation training, while being compatible with known rules of synaptic plasticity in cortex.