In recent years, the use of artificial intelligence on resource-constrained IoT devices has grown significantly. However, existing approaches to deep neural network (DNN) partitioning and offloading across the edge-cloud continuum typically rely on static methods that ignore runtime dynamics, and they are often evaluated in simulation rather than on real hardware. To address this gap, we propose a framework that dynamically splits neural network layers across the heterogeneous continuum. The framework profiles the model at startup, measures network link conditions between nodes, and periodically re-evaluates the partition to adapt to environmental changes. We built a physical testbed comprising a Raspberry Pi as the edge device, a laptop as the fog node, and a high-performance desktop PC as the cloud. We evaluated the framework on three widely adopted convolutional neural networks: VGG16, AlexNet, and MobileNetV2. Our results show that, compared to a static partitioning baseline, the framework reduces energy consumption by 27.09--35.82% and end-to-end latency by 6.34--22.92%. These findings confirm the advantage of adaptive over static partitioning.
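The profile, measure, and re-evaluate cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a linear chain of layers, a fixed edge-to-fog-to-cloud path, and a latency-only objective (energy is ignored). All names (`LayerProfile`, `transfer_ms`, `estimate_latency_ms`, `best_partition`) and all numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations_with_replacement


@dataclass
class LayerProfile:
    """Hypothetical startup profile of one layer: time per tier and output size."""
    edge_ms: float
    fog_ms: float
    cloud_ms: float
    out_bytes: int


def transfer_ms(n_bytes, bandwidth_mbps, rtt_ms):
    # 1 Mbps = 1000 bits per millisecond.
    return rtt_ms + (n_bytes * 8) / (bandwidth_mbps * 1000.0)


def estimate_latency_ms(layers, input_bytes, i, j, edge_fog, fog_cloud):
    """Layers [0, i) run on the edge, [i, j) on the fog, [j, n) on the cloud."""
    n = len(layers)
    # sizes[k] is the number of bytes entering layer k (sizes[n] is the final output).
    sizes = [input_bytes] + [layer.out_bytes for layer in layers]
    total = sum(layer.edge_ms for layer in layers[:i])
    total += sum(layer.fog_ms for layer in layers[i:j])
    total += sum(layer.cloud_ms for layer in layers[j:])
    if i < n:  # something runs beyond the edge, so data crosses the edge-fog link
        total += transfer_ms(sizes[i], *edge_fog)
    if j < n:  # something runs on the cloud, so data crosses the fog-cloud link
        total += transfer_ms(sizes[j], *fog_cloud)
    return total


def best_partition(layers, input_bytes, edge_fog, fog_cloud):
    """Exhaustively search all cut pairs (i, j) with i <= j for the lowest latency."""
    n = len(layers)
    candidates = combinations_with_replacement(range(n + 1), 2)
    return min(
        candidates,
        key=lambda c: estimate_latency_ms(
            layers, input_bytes, c[0], c[1], edge_fog, fog_cloud
        ),
    )


# Toy profile (made-up numbers): three layers whose outputs shrink with depth.
layers = [
    LayerProfile(edge_ms=10, fog_ms=4, cloud_ms=1, out_bytes=1000),
    LayerProfile(edge_ms=50, fog_ms=10, cloud_ms=2, out_bytes=100),
    LayerProfile(edge_ms=50, fog_ms=10, cloud_ms=2, out_bytes=10),
]
fast = (100.0, 1.0)      # (bandwidth in Mbps, round-trip time in ms)
degraded = (0.1, 200.0)  # a badly degraded edge-fog link

# Re-running the search with fresh link measurements is the "periodic re-evaluation".
print(best_partition(layers, 100_000, fast, fast))      # (1, 1)
print(best_partition(layers, 100_000, degraded, fast))  # (3, 3)
```

With both links fast, the sketch keeps only the first layer on the edge and sends the rest to the cloud. When the edge-fog link degrades, it falls back to running everything on the edge. A static partition fixed at deployment time cannot make this switch.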