Many recent pattern recognition applications rely on complex distributed architectures in which sensing and computational nodes interact through a communication network. Deep neural networks (DNNs) play an important role in this scenario, providing powerful decision mechanisms at the price of high computational effort. Consequently, state-of-the-art DNNs are frequently split across several computational nodes, e.g., a first part runs on an embedded device and the rest on a server. Deciding where to split a DNN is a challenge in itself and further complicates the design of deep learning applications. Therefore, we propose Split-Et-Impera, a novel and practical framework that i) determines the set of best split points of a neural network based on deep network interpretability principles, without resorting to a tedious trial-and-error approach, ii) performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements, and iii) suggests the best match between the quality-of-service requirements of the application and the achievable performance in terms of accuracy and latency.
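To make step iii) more concrete, the following is a minimal sketch of how a split point might be matched against quality-of-service requirements. It is not the Split-Et-Impera implementation: the abstract does not specify the selection rule, so the names `SplitCandidate`, `end_to_end_latency_ms`, and `choose_split`, the simple latency model (device compute, a fixed link delay, transfer time, server compute), and the "most accurate feasible candidate" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SplitCandidate:
    """One hypothetical way of cutting the network after a given layer."""
    layer: str           # name of the last layer executed on the device
    accuracy: float      # task accuracy of this arrangement, in [0, 1]
    device_ms: float     # compute time on the embedded device
    server_ms: float     # compute time on the server
    payload_bytes: int   # size of the intermediate tensor sent over the network


def end_to_end_latency_ms(c: SplitCandidate,
                          bandwidth_mbps: float,
                          link_delay_ms: float) -> float:
    """Assumed latency model: compute on both sides plus delay and transfer."""
    transfer_ms = c.payload_bytes * 8 / (bandwidth_mbps * 1e6) * 1e3
    return c.device_ms + link_delay_ms + transfer_ms + c.server_ms


def choose_split(candidates: Sequence[SplitCandidate],
                 bandwidth_mbps: float,
                 link_delay_ms: float,
                 max_latency_ms: float,
                 min_accuracy: float) -> Optional[SplitCandidate]:
    """Return the most accurate candidate meeting both QoS bounds, or None.

    Ties on accuracy are broken in favour of the lower latency.
    """
    def latency(c: SplitCandidate) -> float:
        return end_to_end_latency_ms(c, bandwidth_mbps, link_delay_ms)

    feasible = [c for c in candidates
                if c.accuracy >= min_accuracy and latency(c) <= max_latency_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.accuracy, -latency(c)))
```

In a real pipeline the candidate list would come from step i) and the per-candidate accuracy and timing figures from the communication-aware simulation of step ii); here they are simply inputs to be supplied by the caller. Returning `None` when no candidate is feasible leaves it to the caller to relax the requirements or report that they cannot be met.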