For many real-world time series tasks, the computational complexity of prevalent deep learning models often hinders their deployment in resource-limited environments (e.g., smartphones). Moreover, due to the inevitable domain shift between the model training (source) and deployment (target) stages, compressing these deep models in cross-domain scenarios becomes more challenging. Although some existing works have explored cross-domain knowledge distillation for model compression, they are either biased toward source data or heavily entangled between source and target data. To this end, we design a novel end-to-end framework called Universal and Joint Knowledge Distillation (UNI-KD) for cross-domain model compression. In particular, we propose to transfer both the universal feature-level knowledge across source and target domains and the joint logit-level knowledge shared by both domains from the teacher to the student model via an adversarial learning scheme. More specifically, a feature-domain discriminator is employed to align the teacher's and student's representations for universal knowledge transfer, and a data-domain discriminator is utilized to prioritize domain-shared samples for joint knowledge transfer. Extensive experimental results on four time series datasets demonstrate the superiority of our proposed method over state-of-the-art (SOTA) benchmarks.
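To make the joint logit-level transfer more concrete, the following is a minimal sketch of a sample-weighted distillation loss in plain Python. It is an illustration under stated assumptions, not the UNI-KD implementation: the abstract does not give the weighting rule, so `domain_shared_weight` is a hypothetical choice that treats a sample as domain-shared when a data-domain discriminator is undecided about its origin (output near 0.5). The temperature value and all function names are likewise assumptions, and the adversarial training of the discriminators and the feature-level alignment are not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_divergence(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2 as in standard KD."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Clamp q away from zero so the logarithm stays finite.
    kl = sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


def domain_shared_weight(p_source):
    """Hypothetical weighting (an assumption, not from the paper):
    1 when the discriminator output is 0.5, 0 when it is 0 or 1."""
    return 1.0 - abs(2.0 * p_source - 1.0)


def weighted_kd_loss(teacher_batch, student_batch, p_source_batch, temperature=2.0):
    """Weighted average of per-sample KD divergences.

    p_source_batch holds the (assumed) discriminator probability that each
    sample comes from the source domain.
    """
    weights = [domain_shared_weight(p) for p in p_source_batch]
    losses = [
        kd_divergence(t, s, temperature)
        for t, s in zip(teacher_batch, student_batch)
    ]
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # no sample is considered domain-shared
    return sum(w * l for w, l in zip(weights, losses)) / total_weight
```

With this weighting, a sample that the discriminator confidently assigns to one domain contributes nothing to the logit-level loss, while an ambiguous sample contributes fully. Other monotone mappings from discriminator confidence to weight would serve the same illustrative purpose.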