Optimizing Dense Feed-Forward Neural Networks

Deep learning models have been widely used over the last decade because of their outstanding learning and abstraction capacities. However, one of the main challenges any scientist faces when using deep learning models is choosing the network's architecture. Because of this difficulty, data scientists usually build overly complex models. As a result, most of these models are computationally intensive and have a large memory footprint, which generates huge costs, contributes to climate change, and hinders their use on computationally limited devices. In this paper, we propose a novel method for constructing feed-forward neural networks based on pruning and transfer learning. Its performance has been thoroughly assessed on classification and regression problems. Without any loss of accuracy, our approach can reduce the number of parameters by more than 70%. Furthermore, when the pruning parameter is chosen carefully, most of the refined models outperform the original ones. We also evaluate the contribution of transfer learning by comparing the refined model with a neural network that has the same hyperparameters as the optimized model but is trained from scratch. The results show that our construction method helps design models that are not only more efficient but also more effective.
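The abstract does not spell out the pruning criterion or how weights are transferred, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single hidden layer, scores each hidden unit by the L1 norm of its incoming weights (an assumed criterion), and copies the surviving weights into a smaller network that could then be fine-tuned. The names `prune_hidden_units`, `keep_fraction`, and `count_params` are hypothetical.

```python
import numpy as np


def prune_hidden_units(W1, b1, W2, keep_fraction):
    """Keep the hidden units with the largest incoming-weight L1 norm.

    W1: (n_in, n_hidden), b1: (n_hidden,), W2: (n_hidden, n_out).
    The L1-norm score is an assumption made for illustration only.
    """
    n_hidden = W1.shape[1]
    n_keep = max(1, int(round(keep_fraction * n_hidden)))
    scores = np.abs(W1).sum(axis=0)               # one score per hidden unit
    keep = np.sort(np.argsort(scores)[-n_keep:])  # indices of surviving units
    # The smaller network inherits the surviving weights (the "transfer" step).
    return W1[:, keep], b1[keep], W2[keep, :]


def forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer ReLU network."""
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2


def count_params(*arrays):
    """Total number of scalar parameters across the given arrays."""
    return sum(a.size for a in arrays)


# Randomly initialised stand-in for a trained network (no real data involved).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 64))
b1 = np.zeros(64)
W2 = rng.normal(size=(64, 3))
b2 = np.zeros(3)

W1p, b1p, W2p = prune_hidden_units(W1, b1, W2, keep_fraction=0.25)

before = count_params(W1, b1, W2, b2)
after = count_params(W1p, b1p, W2p, b2)
reduction = 1.0 - after / before
```

In this toy setting, keeping a quarter of the hidden units removes roughly three quarters of the parameters. Whether accuracy is preserved depends on fine-tuning the smaller network on real data, which the sketch does not cover.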
