Adaptive Compression-Aware Split Learning and Inference for Enhanced Network Efficiency

The growing number of AI-driven applications on mobile devices has led to solutions that integrate deep learning models with the available edge-cloud resources. Split learning, in which part of a deep learning model is offloaded from the mobile device and the model is computed in a distributed manner, has become an extensively explored topic because of its multiple benefits: lower on-device energy consumption, improved latency, improved network usage, and certain privacy improvements. Incorporating compression-aware methods, where learning adapts to the compression level of the communicated data, has made split learning even more advantageous. This approach could even offer a viable alternative to traditional methods such as federated learning. In this work, we develop an adaptive compression-aware split learning method ('deprune') that trains and improves deep learning models so that they are much more network-efficient, which makes them well suited to deployment on resource-constrained devices with the help of edge-cloud resources. We also extend this method ('prune') to train deep learning models very quickly through a transfer learning approach, which trades a small amount of accuracy for much more network-efficient inference. We show that the 'deprune' method can reduce network usage by 4x compared with a split-learning approach that does not use our method, without loss of accuracy, while also improving accuracy over compression-aware split learning by 4 percent. Lastly, we show that the 'prune' method can reduce the training time for certain models by up to 6x, without affecting accuracy, compared with a compression-aware split-learning approach.
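To make the network-usage argument concrete, the following is a minimal sketch of split inference with a compressed intermediate activation. It is not the 'deprune' or 'prune' method: top-k sparsification is a stand-in compression scheme chosen purely for illustration, and the layer sizes, function names, and byte widths are all assumptions that do not come from this work.

```python
import random

def dense_relu(x, w):
    """Multiply vector x (length n) by weights w (n rows, m columns), then apply ReLU."""
    return [max(0.0, sum(xi * row[j] for xi, row in zip(x, w)))
            for j in range(len(w[0]))]

def top_k_sparsify(activation, k):
    """Keep the k largest-magnitude entries as (index, value) pairs (illustrative compressor)."""
    keep = sorted(range(len(activation)),
                  key=lambda i: abs(activation[i]), reverse=True)[:k]
    return sorted((i, activation[i]) for i in keep)

def densify(pairs, size):
    """Rebuild a dense vector on the receiving side; dropped entries become zero."""
    out = [0.0] * size
    for i, v in pairs:
        out[i] = v
    return out

# Hypothetical sizes: 16 inputs, 64 units at the split point, 4 outputs.
IN, HIDDEN, OUT, K = 16, 64, 4, 8
rng = random.Random(0)
w_device = [[rng.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(IN)]
w_server = [[rng.gauss(0, 1) for _ in range(OUT)] for _ in range(HIDDEN)]

x = [rng.gauss(0, 1) for _ in range(IN)]
activation = dense_relu(x, w_device)       # computed on the mobile device
message = top_k_sparsify(activation, K)    # the only data sent over the network
received = densify(message, HIDDEN)        # reconstructed on the edge-cloud side
outputs = dense_relu(received, w_server)   # remainder of the model runs remotely

# Assumed encoding: 4-byte values, 2-byte indices for the sparse message.
dense_bytes = HIDDEN * 4
sparse_bytes = K * (4 + 2)
print(f"dense payload: {dense_bytes} B, sparse payload: {sparse_bytes} B")
```

In this toy setup the payload size depends only on how many activation entries are kept, which is why a model trained to tolerate a compressed activation can communicate less per inference. How much accuracy survives a given compression level is exactly what a compression-aware training method has to address, and this sketch does not measure it.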
