Sequence-level Semantic Representation Fusion for Recommender Systems

As recommender systems develop rapidly, a growing amount of side information can be exploited to improve recommendation performance. Specifically, we focus on the \emph{textual data} associated with items (e.g., product titles) and study how text features can be effectively fused with ID features in sequential recommendation. However, the two kinds of item features have distinct data characteristics, which makes direct fusion (e.g., adding text and ID embeddings to form the item representation) less effective. To address this issue, we propose a novel {\ul \emph{Te}}xt-I{\ul \emph{D}} semantic fusion approach for sequential {\ul \emph{Rec}}ommendation, namely \textbf{\our}. The core idea of our approach is to perform semantic fusion at the sequence level by better integrating global contexts. The key strategy is to transform the text embeddings and ID embeddings from the \emph{time domain} to the \emph{frequency domain} via the Fourier transform. In the frequency domain, the global sequential characteristics of the original sequences are inherently aggregated into the transformed representations, so simple multiplicative operations suffice to fuse the two kinds of item features effectively. Our fusion approach can be proved to have the same effect as contextual convolution, thereby achieving sequence-level semantic fusion. To further improve fusion performance, we enhance the discriminability of the text embeddings produced by the text encoder by adaptively injecting positional information via a mixture-of-experts~(MoE) modulation method. Our implementation is available at \textcolor{magenta}{\url{https://github.com/RUCAIBox/TedRec}}.
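
The link between multiplication in the frequency domain and convolution over the sequence can be checked numerically. The sketch below is not the implementation from the repository above; it only illustrates the standard convolution theorem for the discrete Fourier transform, under the assumption that the relevant convolution is circular. Each list stands for one embedding dimension observed across sequence positions, and all function names (`dft`, `idft`, `fuse_in_frequency_domain`, `circular_convolution`) are hypothetical, chosen for illustration.

```python
import cmath
import random


def dft(seq):
    """Naive discrete Fourier transform of a 1-D sequence (O(n^2))."""
    n = len(seq)
    return [
        sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(freq):
    """Naive inverse discrete Fourier transform (O(n^2))."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def fuse_in_frequency_domain(text_seq, id_seq):
    """Multiply the two spectra element-wise, then return to the time domain."""
    text_freq = dft(text_seq)
    id_freq = dft(id_seq)
    fused_freq = [a * b for a, b in zip(text_freq, id_freq)]
    # For real inputs the imaginary part is only numerical noise, so drop it.
    return [value.real for value in idft(fused_freq)]


def circular_convolution(text_seq, id_seq):
    """Direct circular convolution: every output position mixes all positions."""
    n = len(text_seq)
    return [
        sum(text_seq[m] * id_seq[(t - m) % n] for m in range(n))
        for t in range(n)
    ]


# Toy sequences: one embedding dimension over 8 positions (random, illustrative).
random.seed(0)
text_seq = [random.gauss(0, 1) for _ in range(8)]
id_seq = [random.gauss(0, 1) for _ in range(8)]

fused = fuse_in_frequency_domain(text_seq, id_seq)
reference = circular_convolution(text_seq, id_seq)

# Largest absolute difference between the two computations.
max_gap = max(abs(a - b) for a, b in zip(fused, reference))
print(max_gap < 1e-9)
```

Every output position of `circular_convolution` depends on all input positions, which is why a per-frequency product can act as a sequence-level operation. A practical implementation would use a fast Fourier transform from a numerical library rather than the quadratic-time loops shown here.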
