Efficient neural networks are essential for scaling machine learning models to real-time applications and resource-constrained environments. Fully-connected feedforward layers (FFLs) introduce computation and parameter-count bottlenecks within neural network architectures. To address this challenge, we propose a new class of dense layers that generalizes standard fully-connected feedforward layers: \textbf{E}fficient, \textbf{U}nified and \textbf{Gen}eral dense layers (EUGens). EUGens leverage random features to approximate standard FFLs and go beyond them by incorporating a direct dependence on the input norms in their computations. The proposed layers unify existing efficient FFL extensions and improve efficiency by reducing inference complexity from quadratic to linear time. They also lead to \textbf{the first} unbiased algorithms approximating FFLs with arbitrary polynomial activation functions. Furthermore, EUGens reduce the parameter count and computational overhead while preserving the expressive power and adaptability of FFLs. We also present a layer-wise knowledge transfer technique that bypasses backpropagation, enabling efficient adaptation of EUGens to pre-trained models. Empirically, we observe that integrating EUGens into Transformers and MLPs yields substantial improvements in inference speed (up to \textbf{27}\%) and memory efficiency (up to \textbf{30}\%) across a range of tasks, including image classification, language model pre-training, and 3D scene reconstruction. Overall, our results highlight the potential of EUGens for the scalable deployment of large-scale neural networks in real-world scenarios.
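To make the random-feature idea behind the abstract concrete, the following minimal sketch (not the authors' EUGen construction; all variable names are assumptions) shows the basic unbiasedness property such methods rely on: projecting both the input and a weight vector through a shared Gaussian map yields an unbiased Monte Carlo estimate of their dot product, since $\mathbb{E}[G^\top G]/m = I$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 16, 200_000  # input dimension, number of random features

# Hypothetical weight vector of one dense neuron and one input (illustrative).
w = rng.standard_normal(d)
x = rng.standard_normal(d)

# Shared Gaussian projection G of shape (m, d): E[G.T @ G] = m * I, so the
# rescaled dot product of the projected vectors estimates x @ w unbiasedly.
G = rng.standard_normal((m, d))
estimate = (G @ x) @ (G @ w) / m
exact = x @ w
```

With a large feature count `m`, `estimate` concentrates around `exact`; in an actual efficient layer the projection of the weights is precomputed, which is what enables sub-quadratic inference.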