Embedded distributed inference of neural networks has emerged as a promising approach for deploying machine-learning models on resource-constrained devices in an efficient and scalable manner. The inference task is distributed across a network of embedded devices, each of which performs a portion of the overall workload. In some cases, more powerful devices such as edge or cloud servers are part of the system and are responsible for the most demanding layers of the network. As the demand for intelligent systems and the complexity of deployed neural network models increase, this approach is becoming more relevant in a variety of applications such as robotics, autonomous vehicles, smart cities, Industry 4.0, and smart health. We present a systematic review of papers published during the last six years that describe techniques and methods for distributing neural networks across these kinds of systems. We provide an overview of the current state of the art by analysing more than 100 papers, present a new taxonomy to characterize them, and discuss trends and challenges in the field.