Biological neural networks seem qualitatively superior (e.g., in learning, flexibility, robustness) to current artificial networks such as the Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast to these, biological networks feature fundamentally multidirectional signal propagation~\cite{axon}, can also propagate probability distributions, e.g., for uncertainty estimation, and are believed to be unable to use standard backpropagation training~\cite{backprop}. We propose novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction) that remove these low-level differences: each neuron contains a local joint distribution model of its connections, representing the joint density of normalized variables as a linear combination of orthonormal polynomials $(f_\mathbf{j})$: $\rho(\mathbf{x})=\sum_{\mathbf{j}\in B} a_\mathbf{j} f_\mathbf{j}(\mathbf{x})$ for $\mathbf{x} \in [0,1]^d$ and some chosen basis $B$, where growing the basis approaches a complete description of the joint distribution. Various index summations of the $(a_\mathbf{j})$ tensor of neuron parameters yield simple formulas, e.g., for conditional expected values enabling propagation in any direction, such as $E[x|y,z]$ or $E[y|x]$, which degenerate to a KAN-like parametrization when restricted to pairwise dependencies. Such an HCR network can also propagate probability distributions (including joint ones), e.g., $\rho(y,z|x)$. It additionally allows further training approaches, like direct estimation of $(a_\mathbf{j})$, tensor decomposition, or more biologically plausible information bottleneck training: each layer directly influences only its neighbors, optimizing its content to maximize information about the next layer while minimizing information about the previous one to reduce noise.
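As a minimal sketch of the mechanics described above (not the authors' implementation), the snippet below works in two dimensions with a rescaled Legendre basis on $[0,1]$, which is orthonormal under the uniform measure: it directly estimates the $(a_{ij})$ tensor from a sample as $a_{ij} = \frac{1}{n}\sum_k f_i(x_k) f_j(y_k)$, then computes the conditional expected value $E[y|x]$ from $\rho(x,y)=\sum_{ij} a_{ij} f_i(x) f_j(y)$. The closed form uses $\int_0^1 f_j(y)\,dy = \delta_{j0}$, $\int_0^1 y f_0(y)\,dy = 1/2$, $\int_0^1 y f_1(y)\,dy = 1/(2\sqrt{3})$, and $\int_0^1 y f_j(y)\,dy = 0$ for $j \ge 2$. Function names and the toy data are illustrative assumptions.

```python
import numpy as np

def f(j, x):
    """Orthonormal polynomial basis on [0,1] (rescaled Legendre)."""
    if j == 0:
        return np.ones_like(x, dtype=float)
    if j == 1:
        return np.sqrt(3) * (2 * x - 1)
    if j == 2:
        return np.sqrt(5) * (6 * x**2 - 6 * x + 1)
    raise ValueError("basis order not implemented in this sketch")

def estimate_a(xs, ys, m=3):
    """Direct estimation of the (a_ij) tensor: a_ij = mean of f_i(x) f_j(y)."""
    return np.array([[np.mean(f(i, xs) * f(j, ys)) for j in range(m)]
                     for i in range(m)])

def cond_expectation(a, x):
    """E[y|x] for rho(x,y) = sum_ij a_ij f_i(x) f_j(y).

    Numerator: int_0^1 y rho(x,y) dy, using the moments of f_j listed above.
    Denominator: marginal rho(x) = sum_i a_i0 f_i(x), since int f_j = delta_j0.
    """
    fi = np.array([f(i, np.asarray(x, dtype=float)) for i in range(a.shape[0])])
    num = 0.5 * (a[:, 0] @ fi) + (1 / (2 * np.sqrt(3))) * (a[:, 1] @ fi)
    den = a[:, 0] @ fi
    return num / den

# Toy demo: y correlated with x, both pre-normalized to [0,1] (an assumption).
rng = np.random.default_rng(0)
xs = rng.uniform(0, 1, 10_000)
ys = np.clip(xs + rng.normal(0, 0.1, 10_000), 0, 1)
a = estimate_a(xs, ys)
print(cond_expectation(a, 0.3), cond_expectation(a, 0.7))
```

Because the model is just the symmetric tensor $(a_{ij})$, propagation in the reverse direction, $E[x|y]$, follows from the same formulas applied to the transposed tensor `a.T` — the multidirectional propagation mentioned above.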