We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able to deploy machine learning models on edge devices near sensors such that decisions can be made instantaneously, rather than first having to transmit incoming data to servers. To enable deployment on edge devices with limited storage and computational capabilities, the full-precision parameters in standard models can be quantized to use fewer bits. The resulting quantized models are then calibrated using back-propagation and the full training data to ensure accuracy. This one-time calibration works for deployments in static environments. However, model deployment in dynamic edge environments calls for continual calibration to adaptively adjust quantized models to fit new incoming data, which may have different distributions. The first difficulty in enabling continual calibration on the edge is that the full training data may be too large and thus not always available on edge devices. The second difficulty is that the use of back-propagation on the edge for repeated calibration is too expensive. We propose QCore to enable continual calibration on the edge. First, it compresses the full training data into a small subset to enable effective calibration of quantized models with different bit-widths. We also propose a means of updating the subset when new streaming data arrives to reflect changes in the environment, while not forgetting earlier training data. Second, we propose a small bit-flipping network that works with the subset to update quantized model parameters, thus enabling efficient continual calibration without back-propagation. An experimental study, conducted with real-world data in a continual learning setting, offers insight into the properties of QCore and shows that it is capable of outperforming strong baseline methods.
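To make the quantization step mentioned in the abstract concrete, the following is a minimal sketch of symmetric uniform quantization in plain Python. The abstract does not specify which quantization scheme QCore uses, so the scheme here, the function names `quantize_uniform`, `dequantize`, and `mean_abs_error`, and the example weights are all illustrative assumptions. The sketch only shows how representing parameters with fewer bits trades storage for approximation error, which is the loss in accuracy that calibration is meant to recover.

```python
from typing import List, Tuple


def quantize_uniform(weights: List[float], bits: int) -> Tuple[List[int], float]:
    """Map floats to signed integer codes on a symmetric uniform grid.

    Illustrative assumption: one scale for the whole list, chosen so that
    the largest magnitude maps to the largest representable code.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4 bits, 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point and clip to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def mean_abs_error(a: List[float], b: List[float]) -> float:
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical full-precision parameters, for illustration only.
weights = [0.82, -0.41, 0.05, -1.0, 0.33, 0.67]

errors = {}
for bits in (8, 4, 2):
    codes, scale = quantize_uniform(weights, bits)
    errors[bits] = mean_abs_error(weights, dequantize(codes, scale))
    print(f"{bits}-bit codes: {codes}  mean abs error: {errors[bits]:.4f}")
```

In this sketch the reconstruction error grows as the bit-width shrinks, which illustrates why a quantized model typically needs calibration after quantization and why different bit-widths may need different amounts of adjustment.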