Popular artificial neural networks (ANN) optimize parameters for unidirectional value propagation, assuming some arbitrary parametrization type such as Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast, for biological neurons, e.g., "it is not uncommon for axonal propagation of action potentials to happen in both directions"~\cite{axon}, suggesting they are optimized to operate continuously in a multidirectional way. Additionally, the statistical dependencies a single neuron could model are not just (expected) value dependence, but entire joint distributions, including higher moments. Such a more agnostic joint distribution neuron would allow for multidirectional propagation (of distributions or values), e.g. $\rho(x|y,z)$ or $\rho(y,z|x)$, by substituting the known values into $\rho(x,y,z)$ and normalizing. We discuss Hierarchical Correlation Reconstruction (HCR) for such a neuron model: it assumes a $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ type parametrization of the joint distribution in a polynomial basis $f_i$, which allows for flexible, inexpensive processing including nonlinearities, and direct model estimation and update; it can be trained through standard backpropagation or through novel ways specific to this structure, up to tensor decomposition or the information bottleneck approach. Using only pairwise (input-output) dependencies, its expected value prediction becomes KAN-like, with trained activation functions as polynomials; it can be extended by adding higher-order dependencies through included products, in a conscious, interpretable way, allowing for multidirectional propagation of both values and probability densities.
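The parametrization $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ and the "substitute and normalize" step can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the variables are taken to be already normalized to roughly uniform on $[0,1]$, the basis $f_i$ is chosen as orthonormal shifted Legendre polynomials up to degree 3, and the coefficients are estimated as sample means of basis products. The function names (`basis`, `estimate_coefficients`, `conditional_x`, `expected_x`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def basis(u, degree=3):
    """Orthonormal shifted Legendre polynomials on [0, 1] (assumed basis, degree <= 3)."""
    u = np.asarray(u, dtype=float)
    polys = [
        np.ones_like(u),
        np.sqrt(3.0) * (2 * u - 1),
        np.sqrt(5.0) * (6 * u**2 - 6 * u + 1),
        np.sqrt(7.0) * (20 * u**3 - 30 * u**2 + 12 * u - 1),
    ]
    return np.stack(polys[: degree + 1], axis=-1)  # shape (..., degree + 1)

def estimate_coefficients(x, y, z, degree=3):
    """Estimate a_ijk as the sample mean of f_i(x) f_j(y) f_k(z).

    Assumption: for an orthonormal basis, each coefficient can be estimated
    independently as such a mean.
    """
    fx, fy, fz = basis(x, degree), basis(y, degree), basis(z, degree)
    return np.einsum("ni,nj,nk->ijk", fx, fy, fz) / len(x)

def conditional_x(a, y, z):
    """Coefficients of rho(x | y, z) in the basis f_i(x): substitute y, z and normalize."""
    fy = basis(y, a.shape[1] - 1)
    fz = basis(z, a.shape[2] - 1)
    c = np.einsum("ijk,j,k->i", a, fy, fz)
    # c[0] is the marginal density of (y, z); a polynomial model does not
    # guarantee it is positive, so a real implementation would need a guard here.
    return c / c[0]

def expected_x(a, y, z):
    """E[x | y, z]: on [0, 1] only f_0 and f_1 contribute to the integral of x * f_i(x)."""
    c = conditional_x(a, y, z)
    return 0.5 + c[1] / (2 * np.sqrt(3.0))

# Small demo on synthetic data in which x depends on y but not on z.
rng = np.random.default_rng(0)
y_demo = rng.uniform(size=5000)
z_demo = rng.uniform(size=5000)
x_demo = np.clip(y_demo + 0.1 * rng.normal(size=5000), 0.0, 1.0)
a_demo = estimate_coefficients(x_demo, y_demo, z_demo)
print(expected_x(a_demo, 0.2, 0.5), expected_x(a_demo, 0.8, 0.5))
```

The same coefficient tensor serves the other directions as well: contracting `a` with `basis(x)` instead of `basis(y)` and `basis(z)` would give the coefficients of $\rho(y,z|x)$ after dividing by the `[0, 0]` entry, which is the multidirectional propagation the abstract describes. Restricting the sum to terms with at most one nonzero index among the inputs would keep only pairwise dependencies, giving the KAN-like expected value prediction with polynomial activations.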