In this paper, we propose a novel approach to minimizing inference delay in semantic segmentation using split learning (SL), tailored to the needs of real-time computer vision (CV) applications on resource-constrained devices. Semantic segmentation is essential for applications such as autonomous vehicles and smart city infrastructure, but it faces significant latency challenges due to high computational and communication loads. Traditional centralized processing methods are inefficient in such scenarios, often resulting in unacceptable inference delays. SL offers a promising alternative by partitioning deep neural networks (DNNs) between edge devices and a central server, enabling localized data processing and reducing the volume of data that must be transmitted. Our contribution is the joint optimization of bandwidth allocation, selection of the cut layer in the edge devices' DNNs, and the central server's processing resource allocation. We investigate both parallel and serial data processing scenarios and propose low-complexity heuristic solutions that maintain near-optimal performance while reducing computational requirements. Numerical results show that our approach effectively reduces inference delay, demonstrating the potential of SL for improving real-time CV applications in dynamic, resource-constrained environments.
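To make the cut-layer selection idea concrete, the following is a minimal sketch (not the paper's actual formulation or solver) of how end-to-end inference delay varies with the split point: the device computes layers up to the cut, transmits that layer's activations over the allocated bandwidth, and the server computes the rest. All per-layer timings, activation sizes, and the bandwidth value below are hypothetical placeholders.

```python
# Toy 5-layer DNN split between an edge device and a central server.
# All numbers are hypothetical placeholders for illustration only.
device_time = [0.010, 0.020, 0.040, 0.080, 0.160]   # cumulative device compute up to the cut (s)
server_time = [0.020, 0.015, 0.010, 0.006, 0.000]   # server compute for the remaining layers (s)
activation_mbit = [8.0, 4.0, 2.0, 1.5, 1.0]         # activation size at each cut point (Mbit)

bandwidth_mbps = 10.0  # hypothetical uplink bandwidth allocated to this device (Mbit/s)

def total_delay(cut: int) -> float:
    """End-to-end inference delay if the DNN is split after layer `cut` (0-indexed)."""
    transmission = activation_mbit[cut] / bandwidth_mbps
    return device_time[cut] + transmission + server_time[cut]

# Exhaustive search over cut points; the paper's joint optimization would also
# vary bandwidth and server resource allocation across devices.
best_cut = min(range(len(device_time)), key=total_delay)
print(best_cut, round(total_delay(best_cut), 4))  # -> 3 0.236
```

An early cut sends large activations (transmission-dominated delay), while a late cut burdens the constrained device (computation-dominated delay), so an intermediate layer minimizes the total; the joint problem additionally couples this choice with bandwidth and server-resource allocation across devices.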