CELA: Cost-Efficient Language Model Alignment for CTR Prediction

Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. The prevailing ID-based paradigm underperforms in cold-start scenarios due to the skewed distribution of feature frequency. Additionally, the utilization of a single modality fails to exploit the knowledge contained within textual features. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs). They design hard prompts to structure raw features into text for each interaction and then apply PLMs for text processing. With external knowledge and reasoning capabilities, PLMs extract valuable information even in cases of sparse interactions. Nevertheless, compared to ID-based models, pure text modeling degrades the efficacy of collaborative filtering, as well as feature scalability and efficiency during both training and inference. To address these issues, we propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction. CELA incorporates textual features and language models while preserving the collaborative filtering capabilities of ID-based models. This model-agnostic framework can be equipped with plug-and-play textual features, with item-level alignment enhancing the utilization of external information while maintaining training and inference efficiency. Through extensive offline experiments, CELA demonstrates superior performance compared to state-of-the-art methods. Furthermore, an online A/B test conducted on an industrial app recommender system showcases its practical effectiveness, solidifying the potential for real-world applications of CELA.
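To make the "hard prompt" idea from prior work concrete, the following is a minimal sketch of how raw features for one interaction might be rendered as text before being passed to a PLM. The template wording, the function name `build_hard_prompt`, and the feature names are hypothetical illustrations and are not taken from the paper; the sketch shows the earlier text-only approach that the abstract contrasts with, not CELA itself.

```python
def build_hard_prompt(features: dict[str, str]) -> str:
    """Render one interaction's raw features as a single text prompt.

    Hypothetical template: each feature becomes "<name> is <value>".
    """
    parts = [f"{name} is {value}" for name, value in features.items()]
    return "; ".join(parts) + "."


# Hypothetical feature names and values for a single user-item interaction.
interaction = {
    "user age": "25",
    "item title": "Photo Editor",
    "item category": "Tools",
}

prompt = build_hard_prompt(interaction)
print(prompt)
```

Because a prompt like this is built for every interaction, the language model has to process text once per interaction, which is the training and inference cost the abstract attributes to pure text modeling.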

