Model compression techniques can reduce the computational burden of running deep neural networks on embedded devices in industrial systems. This paper addresses the problem of computing a guaranteed output error for neural network compression by quantization. A merged neural network is built from a feedforward neural network and its quantized version, so that its output is the exact difference between the outputs of the two networks. Optimization-based methods and reachability analysis methods are then applied to the merged neural network to compute the guaranteed quantization error. Finally, a numerical example is presented to validate the applicability and effectiveness of the proposed approach.
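The merged-network construction can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: ReLU hidden layers with a linear output layer, a hypothetical uniform rounding quantizer (`quantize`, with an arbitrary step of 0.25), and plain interval arithmetic standing in for the reachability step. The names `forward`, `merge`, and `interval_bounds` are illustrative and are not taken from the paper.

```python
import numpy as np

def quantize(w, step=0.25):
    """Hypothetical uniform quantizer: round to the nearest multiple of step."""
    return np.round(w / step) * step

def forward(layers, x):
    """Evaluate a feedforward network: ReLU on hidden layers, linear output."""
    for i, (W, b) in enumerate(layers):
        x = W @ x + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def merge(layers, qlayers):
    """Build one network whose output is f(x) - f_q(x) for every input x."""
    n = len(layers)
    merged = []
    for i, ((W, b), (Wq, bq)) in enumerate(zip(layers, qlayers)):
        if n == 1:                      # single affine layer: subtract directly
            Wm, bm = W - Wq, b - bq
        elif i == 0:                    # both branches read the same input
            Wm, bm = np.vstack([W, Wq]), np.concatenate([b, bq])
        elif i == n - 1:                # output layer takes the difference
            Wm, bm = np.hstack([W, -Wq]), b - bq
        else:                           # hidden layers run side by side
            Wm = np.block([
                [W, np.zeros((W.shape[0], Wq.shape[1]))],
                [np.zeros((Wq.shape[0], W.shape[1])), Wq],
            ])
            bm = np.concatenate([b, bq])
        merged.append((Wm, bm))
    return merged

def interval_bounds(layers, lo, hi):
    """Propagate an input box through the network with interval arithmetic."""
    for i, (W, b) in enumerate(layers):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Small random network (illustrative sizes) and its quantized copy.
rng = np.random.default_rng(0)
sizes = [2, 4, 3, 1]
net = [(rng.normal(size=(m, n)), rng.normal(size=m))
       for n, m in zip(sizes[:-1], sizes[1:])]
qnet = [(quantize(W), quantize(b)) for W, b in net]

merged = merge(net, qnet)
lo, hi = interval_bounds(merged, -np.ones(2), np.ones(2))
error_bound = float(np.max(np.abs(np.concatenate([lo, hi]))))
print("guaranteed |f(x) - f_q(x)| bound on [-1, 1]^2:", error_bound)
```

Because the hidden layers are stacked block-diagonally, the element-wise ReLU acts on each branch independently, so the merged output equals the difference of the two network outputs exactly. The interval bound is sound for every input in the box but generally conservative; a tighter set-based reachability method or an optimization-based formulation would be expected to shrink it.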