Pre-trained Recommender Systems: A Causal Debiasing Perspective

Recent studies on pre-trained vision/language models have demonstrated the practical benefit of a new, promising solution-building paradigm in AI, in which models are pre-trained on broad data describing a generic task space and then adapted to solve a wide range of downstream tasks, even when training data is severely limited (e.g., in zero- or few-shot learning scenarios). Inspired by this progress, we investigate in this paper the possibilities and challenges of adapting this paradigm to recommender systems, which have been less investigated from the perspective of pre-trained models. In particular, we propose to develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains, and that can then be quickly adapted to improve few-shot learning performance in unseen new domains (with limited data). However, unlike vision/language data, which share strong conformity in the semantic space, the universal patterns underlying recommendation data collected across different domains (e.g., different countries or different e-commerce platforms) are often occluded by both in-domain and cross-domain biases, implicitly imposed by cultural differences in the user and item bases and by the use of different e-commerce platforms. As shown in our experiments, such heterogeneous biases in the data tend to hinder the effectiveness of the pre-trained model. To address this challenge, we further introduce and formalize a causal debiasing perspective, which is substantiated via a hierarchical Bayesian deep learning model named PreRec. Our empirical studies on real-world data show that the proposed model can significantly improve recommendation performance in zero- and few-shot learning settings under both cross-market and cross-platform scenarios.

