Continual Collaborative Distillation for Recommender System

Knowledge distillation (KD) has emerged as a promising technique for addressing the computational challenges associated with deploying large-scale recommender systems. KD transfers the knowledge of a massive teacher system to a compact student model, reducing the heavy computational burden of inference while retaining high accuracy. Existing KD studies focus primarily on one-time distillation in static environments, leaving a substantial gap in their applicability to real-world scenarios with continuously incoming users, items, and interactions. In this work, we investigate a systematic approach to operating teacher-student KD on a non-stationary data stream. Our goal is to enable efficient deployment through a compact student that preserves the high performance of the massive teacher while effectively adapting to continuously incoming data. We propose the Continual Collaborative Distillation (CCD) framework, in which both the teacher and the student continually and collaboratively evolve along the data stream. CCD helps the student adapt effectively to new data while also enabling the teacher to fully leverage accumulated knowledge. We validate the effectiveness of CCD through extensive quantitative, ablative, and exploratory experiments on two real-world datasets. We expect this research direction to help narrow the gap between existing KD studies and practical applications, thereby enhancing the applicability of KD in real-world systems.

