The Rectified Power Unit (RePU) activation function, a differentiable generalization of the Rectified Linear Unit (ReLU), has shown promise in constructing neural networks due to its smoothness properties. However, deep RePU networks often suffer from critical issues such as vanishing or exploding values during training, rendering them unstable regardless of how the initialization hyperparameters are chosen. Leveraging the perspective of effective field theory, we identify the root causes of these failures and propose the Modified Rectified Power Unit (MRePU) activation function. MRePU addresses RePU's limitations while preserving its advantages, such as differentiability and universal approximation properties. Theoretical analysis demonstrates that MRePU satisfies the criticality conditions required for stable training, placing it in a distinct universality class. Extensive experiments validate the effectiveness of MRePU, showing significant improvements in training stability and performance across a range of tasks, including polynomial regression, physics-informed neural networks (PINNs), and real-world vision tasks. Our findings highlight the potential of MRePU as a robust alternative for building deep neural networks.
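For context, RePU generalizes ReLU by raising the rectified input to an integer power; the display below gives the standard definition from the literature (the abstract itself does not state it, so it is included here only for reference):
\[
\operatorname{RePU}_p(x) \;=\; \max(0, x)^p \;=\;
\begin{cases}
x^p, & x \ge 0,\\
0, & x < 0,
\end{cases}
\qquad p \in \mathbb{Z}_{\ge 1},
\]
so that $p = 1$ recovers ReLU, while $p \ge 2$ yields an activation that is $(p-1)$-times continuously differentiable at the origin.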