Traditional click-through rate (CTR) prediction models convert tabular data into one-hot vectors and leverage the collaborative relations among features to infer users' preferences over items. This modeling paradigm discards essential semantic information. Although recent works such as P5 and M6-Rec have explored using Pre-trained Language Models (PLMs) to extract semantic signals for CTR prediction, they are computationally expensive and inefficient. In addition, they do not consider the beneficial collaborative relations, which hinders recommendation performance. To solve these problems, we propose a novel framework, \textbf{CTRL}, which is industry-friendly and model-agnostic, with high training and inference efficiency. Specifically, the original tabular data is first converted into textual data. The tabular data and the converted textual data are regarded as two different modalities and are fed separately into a collaborative CTR model and a pre-trained language model, respectively. A cross-modal knowledge alignment procedure then aligns and integrates the collaborative and semantic signals at a fine granularity, and the lightweight collaborative model can be deployed online for efficient serving after being fine-tuned with supervised signals. Experimental results on three public datasets show that CTRL significantly outperforms state-of-the-art (SOTA) CTR models. Moreover, we further verify its effectiveness on a large-scale industrial recommender system.
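The two steps named in the abstract, converting a tabular sample into text and aligning the two modalities, can be illustrated with a minimal sketch. The abstract does not specify the text template or the alignment objective, so both are assumptions here: `row_to_text` uses a hypothetical "field is value." template, and `alignment_loss` is a generic symmetric contrastive (InfoNCE-style) loss standing in for the cross-modal alignment procedure. The function names, the temperature value, and the toy embeddings are illustrative only.

```python
import math


def row_to_text(row):
    """Render one tabular sample as a sentence (hypothetical template)."""
    return " ".join(f"{field} is {value}." for field, value in row.items())


def normalize(vec):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def alignment_loss(collab_embs, text_embs, temperature=0.1):
    """Symmetric contrastive loss (assumed objective, not taken from the abstract).

    Row i of collab_embs and row i of text_embs describe the same sample and
    form the positive pair; all other rows in the batch act as negatives.
    """
    c = [normalize(v) for v in collab_embs]
    t = [normalize(v) for v in text_embs]
    n = len(c)
    # sim[i][j]: scaled cosine similarity between collaborative i and text j.
    sim = [[sum(a * b for a, b in zip(ci, tj)) / temperature for tj in t] for ci in c]
    # Collaborative-to-text direction: softmax over each row.
    c2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Text-to-collaborative direction: softmax over each column.
    t2c = -sum(log_softmax([sim[i][j] for i in range(n)])[j] for j in range(n)) / n
    return 0.5 * (c2t + t2c)


# Toy usage: one tabular sample and a batch of three embedding pairs.
sample = {"user_id": 17, "gender": "female", "item_category": "sports"}
print(row_to_text(sample))

aligned = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mismatched = aligned[1:] + aligned[:1]
print(alignment_loss(aligned, aligned), alignment_loss(aligned, mismatched))
```

In this toy batch the loss is close to zero when each collaborative embedding matches its own text embedding and much larger when the pairs are shifted, which is the behavior an alignment objective of this kind is meant to reward. In a real system the embeddings would come from the collaborative CTR model and the pre-trained language model rather than from hand-written vectors.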