As the numbers of users and items grow continuously, conventional recommender systems trained on static datasets can hardly adapt to changing environments. High-throughput data requires the model to be updated in a timely manner to capture the dynamics of user interest, which has led to the emergence of streaming recommender systems. Owing to the prevalence of deep learning-based recommender systems, the embedding layer is widely adopted to represent the characteristics of users, items, and other features as low-dimensional vectors. However, it has been shown that setting an identical and static embedding size is sub-optimal in terms of both recommendation performance and memory cost, especially for streaming recommendation. To tackle this problem, we first rethink the streaming model update process and formulate the dynamic embedding size search as a bandit problem. Then, we analyze and quantify the factors that influence the optimal embedding sizes from a statistical perspective. Based on this, we propose the \textbf{D}ynamic \textbf{E}mbedding \textbf{S}ize \textbf{S}earch (\textbf{DESS}) method to minimize the embedding size selection regret on both the user and item sides in a non-stationary manner. Theoretically, we obtain a sublinear regret upper bound that improves on those of previous methods. Empirical results across two recommendation tasks on four public datasets also demonstrate that our approach achieves better streaming recommendation performance with lower memory cost and higher time efficiency.
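To make the bandit framing concrete, the following is a minimal sketch of a generic non-stationary bandit choosing among candidate embedding sizes. It is not the DESS algorithm: it uses a standard discounted-UCB rule as a stand-in, and the candidate sizes, the synthetic reward (a hypothetical proxy for recommendation quality net of memory cost), and all hyperparameters are assumptions made purely for illustration.

```python
import math
import random


class DiscountedUCB:
    """Generic discounted-UCB bandit (illustrative; NOT the paper's DESS method)."""

    def __init__(self, arms, gamma=0.95, explore=0.1):
        self.arms = list(arms)      # candidate embedding sizes (assumed values)
        self.gamma = gamma          # discount factor: older feedback counts less
        self.explore = explore      # exploration weight (assumed hyperparameter)
        self.counts = [0.0] * len(self.arms)  # discounted pull counts
        self.sums = [0.0] * len(self.arms)    # discounted reward sums

    def select(self):
        """Return the index of the arm to play next."""
        # Play any arm whose discounted count has (effectively) vanished.
        for i, c in enumerate(self.counts):
            if c < 1e-12:
                return i
        total = max(sum(self.counts), 1.0)  # keep the logarithm non-negative

        def score(i):
            mean = self.sums[i] / self.counts[i]
            bonus = math.sqrt(self.explore * math.log(total) / self.counts[i])
            return mean + bonus

        return max(range(len(self.arms)), key=score)

    def update(self, i, reward):
        """Discount all statistics, then credit the observed reward to arm i."""
        self.counts = [c * self.gamma for c in self.counts]
        self.sums = [s * self.gamma for s in self.sums]
        self.counts[i] += 1.0
        self.sums[i] += reward


def simulate(rounds=600, switch=300, seed=0):
    """Run the bandit on a synthetic stream whose best size changes at `switch`."""
    rng = random.Random(seed)
    sizes = [8, 16, 32, 64]                     # hypothetical candidate sizes
    bandit = DiscountedUCB(sizes)
    picks = []
    for t in range(rounds):
        best = 16 if t < switch else 64         # synthetic drift in the optimum
        i = bandit.select()
        base = 0.8 if sizes[i] == best else 0.3  # made-up reward levels
        reward = base + rng.gauss(0.0, 0.05)     # small synthetic noise
        bandit.update(i, reward)
        picks.append(sizes[i])
    return picks


if __name__ == "__main__":
    picks = simulate()
    print("share of size 16 before the switch:", picks[200:300].count(16) / 100)
    print("share of size 64 after the switch: ", picks[500:600].count(64) / 100)
```

The discount factor is what lets this sketch follow a moving optimum: once the best size changes, the old arm's accumulated evidence decays, so the new arm can overtake it. How DESS itself handles non-stationarity, and how it derives rewards from streaming data, is specified in the paper rather than here.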