Sparse neural networks with skip-connections for identification of aluminum electrolysis cell

Neural networks are rapidly gaining interest in nonlinear system identification because of their ability to capture complex input-output relations directly from data. Despite this flexibility, concerns remain about the safety of these models in this context and about their need for large amounts of potentially expensive data. Aluminum electrolysis is a highly nonlinear production process, and most of the data must be sampled manually, which makes sampling expensive and infrequent. When state variables are measured infrequently, the accuracy and open-loop stability of long-term predictions become highly important. Standard neural networks struggle to provide stable long-term predictions with limited training data. In this work, we investigate the effect of combining concatenated skip connections and sparsity-promoting $\ell_1$ regularization on the open-loop stability and accuracy of forecasts with short, medium, and long prediction horizons. The case study is conducted on a high-dimensional, nonlinear simulator representing the mass and energy balance of an aluminum electrolysis cell. The proposed model structure, referred to as InputSkip, contains concatenated skip connections from the input layer and all intermediate layers to the output layer. InputSkip trained with $\ell_1$ regularization is called sparse InputSkip. The results show that sparse InputSkip outperforms dense and sparse standard feedforward neural networks and dense InputSkip in open-loop stability and long-term predictive accuracy. The results are significant for models trained on datasets of all sizes (small, medium, and large training sets) and for all prediction horizons (short, medium, and long).
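The InputSkip structure and the $\ell_1$ penalty described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`init_params`, `input_skip_forward`, `l1_penalty`, `loss`), the ReLU activation, the layer sizes, the mean-squared-error loss, and the choice to penalize only weight matrices (not biases) are all assumptions made for illustration.

```python
import numpy as np


def init_params(n_in, hidden_sizes, n_out, seed=0):
    """Random weights for a feedforward net whose output layer sees
    the input and every hidden layer (all sizes are hypothetical)."""
    rng = np.random.default_rng(seed)
    hidden = []
    prev = n_in
    for h in hidden_sizes:
        hidden.append((rng.normal(0.0, 0.1, (prev, h)), np.zeros(h)))
        prev = h
    # The output layer receives the concatenation of input + all hidden outputs.
    n_concat = n_in + sum(hidden_sizes)
    out = (rng.normal(0.0, 0.1, (n_concat, n_out)), np.zeros(n_out))
    return {"hidden": hidden, "out": out}


def input_skip_forward(params, x):
    """Forward pass with concatenated skip connections to the output layer."""
    features = [x]                      # skip connection from the input layer
    h = x
    for W, b in params["hidden"]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU is an assumed activation
        features.append(h)              # skip connection from this hidden layer
    z = np.concatenate(features, axis=-1)
    W_out, b_out = params["out"]
    return z @ W_out + b_out


def l1_penalty(params, lam):
    """lam times the sum of absolute weight values (biases excluded by assumption)."""
    weights = [W for W, _ in params["hidden"]] + [params["out"][0]]
    return lam * sum(np.abs(W).sum() for W in weights)


def loss(params, x, y, lam):
    """Mean-squared error plus the sparsity-promoting l1 term."""
    pred = input_skip_forward(params, x)
    return np.mean((pred - y) ** 2) + l1_penalty(params, lam)
```

Setting `lam` to zero gives the dense variant of this sketch, while a positive `lam` pushes weights toward zero during training, which is the sense in which the regularized model is called sparse.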

