Personalized Elastic Embedding Learning for On-Device Recommendation

To address privacy concerns and reduce network latency, there has been a recent trend of compressing cumbersome recommendation models trained on the cloud and deploying the resulting compact models to resource-limited devices for real-time recommendation. Existing solutions generally overlook device heterogeneity and user heterogeneity: they require either all devices to share the same compressed model or all devices with the same resource budget to share the same model. However, even users with the same device may have different preferences. In addition, existing solutions assume that the resources (e.g., memory) available to the recommender on a device are constant, which does not reflect reality. In light of device and user heterogeneity as well as dynamic resource constraints, this paper proposes a Personalized Elastic Embedding Learning framework (PEEL) for on-device recommendation. PEEL generates personalized embeddings for devices with various memory budgets in a once-for-all manner, adapts efficiently to new or dynamic budgets, and addresses the diversity of user preferences by assigning personalized embeddings to different groups of users. Specifically, PEEL first pretrains on user-item interaction instances to generate the global embedding table and clusters users into groups. It then refines the embedding tables within each group using that group's local interaction instances. A personalized elastic embedding is generated from the group-wise embedding blocks and their weights, which indicate the contribution of each embedding block to local recommendation performance. Because a personalized elastic embedding is obtained by selecting the embedding blocks with the largest weights, PEEL adapts efficiently to dynamic memory budgets. Extensive experiments on two public datasets show that PEEL yields superior performance on devices with heterogeneous and dynamic memory budgets.
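
The block-selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`select_blocks`, `elastic_embedding`), the equal-width block layout, the toy weights, and the choice to express the memory budget as a number of blocks are all assumptions made for illustration. It only shows the general idea of keeping the highest-weight embedding blocks that fit a given budget.

```python
from typing import List, Sequence


def select_blocks(block_weights: Sequence[float], budget_blocks: int) -> List[int]:
    """Return indices of the highest-weight blocks that fit the budget.

    Hypothetical helper: the budget is expressed as a block count, and the
    kept indices are returned in their original order.
    """
    if budget_blocks < 0:
        raise ValueError("budget_blocks must be non-negative")
    k = min(budget_blocks, len(block_weights))
    # Rank block indices by weight, largest first; ties keep the lower index.
    ranked = sorted(range(len(block_weights)), key=lambda i: -block_weights[i])
    return sorted(ranked[:k])


def elastic_embedding(
    blocks: Sequence[Sequence[float]],
    block_weights: Sequence[float],
    budget_blocks: int,
) -> List[float]:
    """Concatenate the selected blocks into one smaller embedding vector."""
    kept = select_blocks(block_weights, budget_blocks)
    embedding: List[float] = []
    for i in kept:
        embedding.extend(blocks[i])
    return embedding


# Toy data (assumed): one item embedding split into four blocks of width 2,
# with one weight per block for a single user group.
item_blocks = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
group_weights = [0.05, 0.60, 0.10, 0.25]

small = elastic_embedding(item_blocks, group_weights, budget_blocks=1)
medium = elastic_embedding(item_blocks, group_weights, budget_blocks=2)
print(small)   # embedding under the tightest budget
print(medium)  # embedding after the budget grows to two blocks
```

In this sketch, a change in the budget only changes how many blocks are kept, so no retraining is involved when the budget shrinks or grows; that is the property the abstract refers to as elasticity.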
