Pre-trained Language Models (PLMs) have achieved tremendous success in Natural Language Processing (NLP) by learning universal representations from large corpora in a self-supervised manner. The pre-trained models and their learned representations benefit a wide range of downstream NLP tasks. This training paradigm has recently been adapted to the recommendation domain and is considered a promising approach by both academia and industry. In this paper, we systematically investigate how to extract and transfer knowledge from models pre-trained under different PLM-related training paradigms to improve recommendation performance from various perspectives, such as generality, sparsity, efficiency, and effectiveness. Specifically, we propose a comprehensive taxonomy that divides existing PLM-based recommender systems according to their training strategies and objectives. We then analyze and summarize the connection between PLM-based training paradigms and the different input data types used by recommender systems. Finally, we elaborate on open issues and future research directions in this vibrant field.