The conventional deep learning paradigm often involves training a deep model on a server and then deploying the model, or a distilled variant of it, to resource-limited edge devices. Once deployed, the models typically remain fixed (at least for some period) due to the potentially high cost of model adaptation on both the server and edge sides. However, in many real-world scenarios, the test environments may change dynamically (known as distribution shifts), which often degrades performance. Thus, one has to adapt the edge models promptly to attain promising performance. Moreover, as more data is collected at the edge, this paradigm also fails to further adapt the cloud model for better performance. Addressing these issues raises two primary challenges: 1) the edge model has limited computation power and may only support forward propagation; 2) the data transmission budget between cloud and edge devices is limited in latency-sensitive scenarios. In this paper, we establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models need only perform forward propagation while still being adapted online. In our CEMA, to reduce the communication burden, we devise two criteria to exclude unnecessary samples from uploading to the cloud, i.e., dynamic unreliable sample exclusion and low-informative sample exclusion. Based on the uploaded samples, we update and distribute the affine parameters of the normalization layers by distilling from a stronger foundation model to the edge model with a sample replay strategy. Extensive experimental results on ImageNet-C and ImageNet-R verify the effectiveness of our CEMA.
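The two exclusion criteria above can be illustrated with a minimal entropy-based filter. This is a hedged sketch, not the paper's exact formulation: it assumes that "unreliable" samples are those whose prediction entropy exceeds an upper threshold and "low-informative" samples are those below a lower threshold, so only samples in between are uploaded; the function name and the fixed thresholds are illustrative, and the actual method adjusts its upper threshold dynamically during adaptation.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy of softmax predictions, one value per sample."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def select_for_upload(probs, e_high, e_low):
    """Decide which samples an edge device uploads to the cloud.

    Samples with entropy above e_high are excluded as unreliable
    (too uncertain to adapt on); samples below e_low are excluded
    as low-informative (the model is already confident on them).
    Returns a boolean mask over the batch. Thresholds here are
    fixed for illustration; a dynamic schedule would tighten
    e_high as adaptation proceeds.
    """
    h = entropy(probs)
    return (h > e_low) & (h < e_high)

# Example: a confident, a uniform, and a moderately uncertain prediction.
probs = np.array([
    [0.97, 0.01, 0.01, 0.01],   # low entropy  -> low-informative, skip
    [0.25, 0.25, 0.25, 0.25],   # high entropy -> unreliable, skip
    [0.60, 0.20, 0.10, 0.10],   # in between   -> upload
])
mask = select_for_upload(probs, e_high=1.2, e_low=0.4)
# Only the third sample is selected for upload.
```

In this toy batch, only one of three samples would be uploaded, which is the point of the criteria: the communication budget is spent only on samples that are both trustworthy and useful for adapting the normalization-layer affine parameters in the cloud.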