Biological cortical neurons are remarkably sophisticated computational devices, temporally integrating their vast synaptic input over an intricate dendritic tree, subject to complex, nonlinearly interacting internal biological processes. A recent study proposed to characterize this complexity by fitting accurate surrogate models that replicate the input-output relationship of a detailed biophysical model of a cortical pyramidal neuron, and found that temporal convolutional networks (TCNs) with millions of parameters were required. Requiring this many parameters, however, could stem from a misalignment between the inductive biases of the TCN and the cortical neuron's computations. In light of this, and to explore the computational implications of leaky memory units and nonlinear dendritic processing, we introduce the Expressive Leaky Memory (ELM) neuron model, a biologically inspired phenomenological model of a cortical neuron. Remarkably, by exploiting slowly decaying memory-like hidden states and two-layered nonlinear integration of synaptic input, our ELM neuron accurately matches the aforementioned input-output relationship with under ten thousand trainable parameters. To further assess the computational ramifications of our neuron design, we evaluate it on various tasks with demanding temporal structure, including the Long Range Arena (LRA) datasets, as well as a novel neuromorphic dataset based on the Spiking Heidelberg Digits dataset (SHD-Adding). Leveraging a larger number of memory units with sufficiently long timescales, and correspondingly sophisticated synaptic integration, the ELM neuron displays substantial long-range processing capabilities, reliably outperforming classic Transformer and Chrono-LSTM architectures on LRA, and even solving the Pathfinder-X task with over 70% accuracy (16k context length).
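To make the described mechanism concrete, the following is a minimal, hypothetical sketch of an ELM-style update: a vector of memory units leaks toward zero with per-unit timescales, while a small two-layer nonlinearity integrates the synaptic input together with the current memory state. All dimensions, weight shapes, and the exact update rule here are illustrative assumptions, not the paper's actual equations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper)
n_syn, n_mem = 8, 4          # synaptic inputs, memory units
dt = 1.0                     # nominal time step

# Per-unit timescales; long tau -> slowly decaying, memory-like hidden state
tau = np.array([5.0, 20.0, 100.0, 500.0])
decay = np.exp(-dt / tau)    # element-wise leak factor per step

# Two-layer nonlinear integration of [synaptic input, memory state]
W1 = rng.normal(0.0, 0.3, (16, n_syn + n_mem))
W2 = rng.normal(0.0, 0.3, (n_mem, 16))
w_out = rng.normal(0.0, 0.3, n_mem)

def elm_step(m, x):
    """One update of the sketched ELM-style neuron."""
    h = np.tanh(W1 @ np.concatenate([x, m]))    # layer 1: mix input and memory
    delta = np.tanh(W2 @ h)                     # layer 2: proposed memory target
    m_new = decay * m + (1.0 - decay) * delta   # leaky integration toward target
    y = w_out @ m_new                           # scalar somatic readout
    return m_new, y

m = np.zeros(n_mem)
for _ in range(50):
    x = rng.normal(0.0, 1.0, n_syn)             # surrogate synaptic input
    m, y = elm_step(m, x)
```

Because each memory unit is a convex combination of its past value and a bounded update, units with large `tau` change slowly and can retain information over long horizons, which is the intuition behind the long-range results reported above.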