Attention improves representation learning over RNNs, but its discrete nature limits continuous-time (CT) modeling. We introduce the Neuronal Attention Circuit (NAC), a novel, biologically inspired CT attention mechanism that reformulates attention-logit computation as the solution of a linear first-order ODE with nonlinear interlinked gates, derived by repurposing the C. elegans Neuronal Circuit Policies (NCPs) wiring. NAC replaces dense key-query projections with sparse sensory gates, and uses a sparse backbone network with two heads to compute content-target and learnable time-constant gates, enabling efficient adaptive dynamics. To reduce runtime and memory costs, we implement an adaptable subquadratic sparse Top-K pairwise concatenation mechanism that selectively curates key-query interactions. We provide rigorous theoretical guarantees, including state stability and bounded approximation error. Empirically, we evaluate NAC on diverse tasks, including irregular time-series classification, lane keeping for autonomous vehicles, and industrial prognostics. We observe that NAC matches or outperforms competing baselines in accuracy, and sits between several state-of-the-art CT baselines in runtime and memory consumption, while remaining interpretable at the level of individual neuron cells.
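For concreteness, one plausible instantiation of the logit ODE, written in the liquid time-constant style that NCP wiring builds on (the symbols $s_{qk}$, $\tau_{qk}$, $f$, and $A_{qk}$ below are illustrative, not taken from the paper body), is

$$\frac{\mathrm{d}s_{qk}(t)}{\mathrm{d}t} = -\left[\frac{1}{\tau_{qk}} + f\!\left(z_{qk};\theta\right)\right] s_{qk}(t) + f\!\left(z_{qk};\theta\right) A_{qk},$$

where $s_{qk}$ is the logit for a key-query pair, $z_{qk}$ is the sparsely gated key-query input, $f(\cdot;\theta)$ is the backbone head acting as a learnable time-constant gate, and $A_{qk}$ is the content target from the second head. Because the ODE is linear in $s_{qk}$, its solution between observation times stays bounded for bounded gate outputs, which is consistent with the state-stability guarantee the abstract cites.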
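A minimal sketch of the subquadratic Top-K pairwise concatenation, assuming dot-product scoring and the hypothetical helper name topk_pairs (the paper's actual scoring rule and gating may differ):

```python
import torch

def topk_pairs(q: torch.Tensor, k: torch.Tensor, top_k: int = 8) -> torch.Tensor:
    """For each query, keep only its top_k keys and return the concatenated
    (query, key) pairs: O(N * top_k) pairs instead of O(N * M).

    q: (N, d) queries after the sparse sensory gate
    k: (M, d) keys after the sparse sensory gate
    """
    scores = q @ k.T                                # (N, M) raw similarity scores
    top_k = min(top_k, k.shape[0])                  # guard against short sequences
    idx = scores.topk(top_k, dim=-1).indices        # (N, top_k) selected key indices
    k_sel = k[idx]                                  # (N, top_k, d) gathered keys
    q_exp = q.unsqueeze(1).expand(-1, top_k, -1)    # (N, top_k, d) repeated queries
    return torch.cat([q_exp, k_sel], dim=-1)        # (N, top_k, 2d) pairs for the backbone
```

With N = M = 1024 tokens and top_k = 8, this materializes 8,192 pairs rather than roughly a million, which is where the subquadratic memory saving comes from.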