Distributed inference is a popular approach to efficient DNN inference at the edge. However, traditional Static and Dynamic DNNs are not distribution-friendly, which causes system reliability and adaptability issues. In this paper, we introduce Fluid Dynamic DNNs (Fluid DyDNNs), tailored for distributed inference. Unlike Static and Dynamic DNNs, Fluid DyDNNs use a novel nested incremental training algorithm that enables their sub-networks to operate both independently and in combination, enhancing system reliability and adaptability. Evaluation on embedded Arm CPUs with a DNN model and the MNIST dataset shows that, when a single device fails, Fluid DyDNNs ensure continued inference, whereas Static and Dynamic DNNs fail. When all devices are operational, Fluid DyDNNs can operate either in a High-Accuracy mode, achieving accuracy comparable to Static DNNs, or in a High-Throughput mode, achieving 2.5x and 2x the throughput of Static and Dynamic DNNs, respectively.
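To make the idea of sub-networks that work both independently and in combination concrete, the following is a minimal NumPy sketch and not the paper's implementation. It assumes, purely for illustration, a one-hidden-layer classifier whose hidden units are split into two disjoint halves, one per device; the names `partial_logits`, `infer`, `SUB_A`, and `SUB_B`, the layer sizes, and the random weights are all hypothetical. The nested incremental training that would make each half accurate on its own is not shown.

```python
import numpy as np

# Hypothetical sizes and random weights; a trained model would replace these.
rng = np.random.default_rng(0)
IN, HID, OUT = 16, 8, 10
W1 = rng.standard_normal((IN, HID))
b1 = rng.standard_normal(HID)
W2 = rng.standard_normal((HID, OUT))
b2 = rng.standard_normal(OUT)

# Assumption: each device owns a disjoint slice of the hidden units.
SUB_A = slice(0, HID // 2)
SUB_B = slice(HID // 2, HID)


def partial_logits(x, units):
    """Contribution of one sub-network (one device) to the output logits."""
    h = np.maximum(x @ W1[:, units] + b1[units], 0.0)  # ReLU on this slice only
    return h @ W2[units, :]


def infer(x, alive):
    """Sum the partial logits of the sub-networks that are still reachable."""
    if not alive:
        raise ValueError("at least one sub-network must be available")
    return sum(partial_logits(x, units) for units in alive) + b2


x = rng.standard_normal((3, IN))
combined = infer(x, [SUB_A, SUB_B])  # both devices up: equals the full network
degraded = infer(x, [SUB_A])         # device B failed: inference still runs
```

Under these assumptions, summing the two partial results reproduces the full network's logits exactly, which loosely corresponds to a combined (High-Accuracy) mode, while a single slice still yields a valid prediction after a device failure. Running each slice on a different input batch would loosely correspond to a High-Throughput mode.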