Recently, the generality of natural language text has been leveraged to develop transferable recommender systems. The basic idea is to employ pre-trained language models (PLMs) to encode item text into item representations. Despite the promising transferability, the binding between item text and item representations might be too tight, leading to potential problems such as over-emphasizing the effect of text features and exaggerating the negative impact of the domain gap. To address this issue, this paper proposes VQ-Rec, a novel approach to learning Vector-Quantized item representations for transferable sequential Recommenders. The main novelty of our approach lies in the new item representation scheme: it first maps item text into a vector of discrete indices (called the item code), and then uses these indices to look up the code embedding table to derive item representations. Such a scheme can be denoted as "text $\Longrightarrow$ code $\Longrightarrow$ representation". Based on this representation scheme, we further propose an enhanced contrastive pre-training approach that uses semi-synthetic and mixed-domain code representations as hard negatives. Furthermore, we design a new cross-domain fine-tuning method based on a differentiable permutation-based network. Extensive experiments conducted on six public benchmarks demonstrate the effectiveness of the proposed approach in both cross-domain and cross-platform settings. The code and pre-trained model are available at https://github.com/RUCAIBox/VQ-Rec.
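The "text $\Longrightarrow$ code $\Longrightarrow$ representation" scheme can be illustrated with a minimal sketch. It is not the released implementation. The abstract does not specify how text is discretized or how code embeddings are combined, so the sketch assumes a product-quantization-style discretizer (each sub-vector of the text encoding is assigned to its nearest centroid) and mean pooling over the looked-up code embeddings. The function names `text_to_code` and `code_to_representation`, all sizes, and the random vector standing in for a PLM encoding are hypothetical.

```python
import numpy as np


def text_to_code(text_emb, centroids):
    """Map one text encoding to a vector of D discrete indices (the item code).

    Assumption: product-quantization-style assignment.
    text_emb:  shape (D * d_sub,)
    centroids: shape (D, M, d_sub), i.e. M centroids for each of D sub-spaces
    """
    D, M, d_sub = centroids.shape
    sub_vectors = text_emb.reshape(D, d_sub)
    # Squared distance from each sub-vector to each centroid of its sub-space.
    dists = ((sub_vectors[:, None, :] - centroids) ** 2).sum(axis=-1)  # (D, M)
    return dists.argmin(axis=1)  # (D,)


def code_to_representation(code, code_table):
    """Look up one embedding per code digit and pool them.

    Assumption: mean pooling; code_table has shape (D, M, d_model).
    """
    D = code.shape[0]
    looked_up = code_table[np.arange(D), code]  # (D, d_model)
    return looked_up.mean(axis=0)  # (d_model,)


rng = np.random.default_rng(0)
D, M, d_sub, d_model = 4, 8, 6, 16  # illustrative sizes only

centroids = rng.normal(size=(D, M, d_sub))     # stand-in for a fitted quantizer
code_table = rng.normal(size=(D, M, d_model))  # stand-in for a learnable table
text_emb = rng.normal(size=D * d_sub)          # stand-in for a PLM text encoding

item_code = text_to_code(text_emb, centroids)               # text => code
item_repr = code_to_representation(item_code, code_table)   # code => representation
```

In this sketch the item representation is read from `code_table` rather than taken directly from the text encoding, so the table can be updated without changing the text encoder or the item codes. This is one way to picture the looser binding between item text and item representations that the abstract describes.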