Recently, a million biological neurons (BNN) were shown to outperform modern RL methods at playing Pong~\cite{RL}, a reminder that biological neurons remain qualitatively superior, e.g. in learning, flexibility and robustness. This suggests improving current artificial neurons, e.g. MLP/KAN, for better agreement with biological ones. An extension of the KAN approach is proposed, to neurons containing a model of the local joint distribution: $\rho(\mathbf{x})=\sum_{\mathbf{j}\in B} a_\mathbf{j} f_\mathbf{j}(\mathbf{x})$ for $\mathbf{x} \in [0,1]^d$. It adds interpretation and information flow control to KAN, and allows gradually adding three basic properties of biological neurons that are currently missing: 1) Biological axons propagate in both directions~\cite{axon}, while current artificial neurons focus on unidirectional propagation; joint distribution neurons can repair this by substituting some variables to obtain conditional values/distributions for the remaining ones. 2) Animals show risk avoidance~\cite{risk}, which requires processing variance, and the real world generally calls for probabilistic models; the proposed neurons can also predict and propagate distributions as vectors of moments: (expected value, variance) or higher. 3) Biological neurons require local training; besides backpropagation, the proposed approach allows many additional training methods, such as direct training, tensor decomposition, or the local and promising information bottleneck. The proposed approach is very general and can also be used as an extension of softmax in embeddings of e.g. transformer, JEPA, Mamba, suggesting the interpretation that features are mixed moments of the joint density of real-world properties.
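The substitution idea in point 1) and the moment propagation in point 2) can be illustrated with a minimal sketch for $d=2$. It rests on several assumptions not fixed by the text above: the basis $f_\mathbf{j}$ is taken as products of Legendre polynomials rescaled to be orthonormal on $[0,1]$, the data are rank-normalized to nearly uniform marginals, and the coefficients $a_\mathbf{j}$ are estimated as sample averages of $f_\mathbf{j}$ (one possible reading of "direct training"). All function names (`basis`, `fit_joint`, `conditional_moments`, `to_uniform`) are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def basis(u, m):
    """Rows k = 0..m: Legendre polynomials rescaled to be orthonormal on [0, 1]."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return np.stack([
        np.sqrt(2 * k + 1) * legendre.legval(2 * u - 1, np.eye(m + 1)[k])
        for k in range(m + 1)
    ])


def fit_joint(x, y, m):
    """Assumed estimator: a[j, k] = mean over samples of f_j(x) * f_k(y)."""
    return basis(x, m) @ basis(y, m).T / len(x)


def conditional_moments(a, value, given_axis=0):
    """Substitute one variable and return (mean, variance) of the other.

    given_axis=0: value is x, moments of y | x.
    given_axis=1: value is y, moments of x | y (the reverse direction).
    """
    m = a.shape[0] - 1
    f_given = basis(value, m)[:, 0]
    # Coefficients of the (unnormalized) conditional density in the basis.
    w = (a.T if given_axis == 0 else a) @ f_given
    # Gauss-Legendre quadrature mapped from [-1, 1] to [0, 1];
    # m + 2 nodes integrate these polynomial integrands exactly.
    t, wt = legendre.leggauss(m + 2)
    u, wt = (t + 1) / 2, wt / 2
    dens = (w @ basis(u, m)) / w[0]  # normalized: only f_0 has nonzero integral
    mean = np.sum(wt * u * dens)
    var = np.sum(wt * u ** 2 * dens) - mean ** 2
    return mean, var


def to_uniform(v):
    """Rank-normalize a sample to nearly uniform values in (0, 1)."""
    return (np.argsort(np.argsort(v)) + 0.5) / len(v)


# Toy data (an assumption for illustration): two positively dependent variables.
rng = np.random.default_rng(0)
raw_x = rng.normal(size=5000)
raw_y = raw_x + rng.normal(size=5000)
x, y = to_uniform(raw_x), to_uniform(raw_y)

a = fit_joint(x, y, m=3)

mean_hi, var_hi = conditional_moments(a, 0.9, given_axis=0)  # y given x = 0.9
mean_lo, var_lo = conditional_moments(a, 0.1, given_axis=0)  # y given x = 0.1
mean_rev, var_rev = conditional_moments(a, 0.9, given_axis=1)  # x given y = 0.9
print(mean_hi, var_hi, mean_lo, var_lo, mean_rev, var_rev)
```

The same coefficient array `a` serves both directions; only the axis being substituted changes. A polynomial density of this kind can become negative in places, so a practical version would likely need calibration, which this sketch omits.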