As deep neural networks grow larger and more complex, most edge devices cannot meet their processing requirements. Distributed inference, which partitions a neural network across a cluster of nodes, is therefore essential. However, distribution can increase energy consumption and create dependencies among devices that suffer from unstable transmission rates. Unstable transmission rates harm the real-time performance of IoT devices, causing high latency, high energy usage, and potential failures. Hence, dynamic systems need a resilient DNN with an adaptive architecture that can downsize according to the available resources. This paper presents an empirical study that identifies the connections in ResNet that can be dropped without significantly impacting the model's performance, enabling distribution when resources are scarce. Based on the results, a multi-objective optimization problem is formulated to minimize latency and maximize accuracy subject to the available resources. Our experiments demonstrate that an adaptive ResNet architecture can reduce shared data, energy consumption, and latency throughout the distribution while maintaining high accuracy.