Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms cause dead neurons, discard directional information, and destroy the orthogonality of feature representations. Inspired by the frequency-modulated transmission of biological axons, we propose the Z-Plane Neural Network, which maps hidden states into bundles of 2D phasors on a hypersphere. We introduce a geometric activation function, Radial Bounding, $\mathbf{x} \mapsto \mathbf{x} / \max(1, \|\mathbf{x}\|_2)$, which limits the magnitude (energy) of each vector while preserving its phase (direction). We show mathematically that this isotropic activation is 1-Lipschitz and prevents vanishing gradients by preserving tangential gradients. Empirically, a 100-layer Z-Plane multi-layer perceptron (MLP) with no ReLU and no LayerNorm converges on the MNIST dataset with 98.34% accuracy and without numerical instability, showing that bounded geometric activation alone can be sufficient for stable deep learning.
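As an illustration of the Radial Bounding formula above, the following is a minimal sketch in plain Python, not the authors' implementation. The function names `radial_bound` and `l2_distance`, the choice of 2D test vectors, and the random sampling range are assumptions made for this example. It applies $\mathbf{x} / \max(1, \|\mathbf{x}\|_2)$ to a single vector and then checks the 1-Lipschitz property empirically on random pairs.

```python
import math
import random


def radial_bound(x):
    """Return x / max(1, ||x||_2) for a vector given as a list of floats."""
    norm = math.sqrt(sum(v * v for v in x))
    scale = max(1.0, norm)  # vectors inside the unit ball are left unchanged
    return [v / scale for v in x]


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# A vector with norm below 1 passes through unchanged.
inside = radial_bound([0.3, -0.4])

# A vector with norm above 1 is rescaled to unit norm; its direction is kept.
outside = radial_bound([3.0, 4.0])

# Empirical 1-Lipschitz check: output distances should never exceed
# input distances (up to floating-point rounding).
rng = random.Random(0)
max_ratio = 0.0
for _ in range(1000):
    a = [rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)]
    b = [rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)]
    d_in = l2_distance(a, b)
    if d_in > 0.0:
        d_out = l2_distance(radial_bound(a), radial_bound(b))
        max_ratio = max(max_ratio, d_out / d_in)
```

The map coincides with the Euclidean projection onto the closed unit ball, which is why the ratio `max_ratio` is expected to stay at or below 1. The random check is only a sanity test of that property, not a substitute for the proof.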