In modern commercial search engines and recommendation systems, data from multiple domains is available to jointly train a multi-domain model. Traditional methods train multi-domain models in a multi-task setting, using shared parameters to learn what the tasks have in common and task-specific parameters to capture how the features, labels, and sample distributions of individual tasks diverge. With the development of large language models (LLMs), an LLM can extract global domain-invariant text features that serve both search and recommendation tasks. We propose a novel framework called S&R Multi-Domain Foundation, which uses an LLM to extract domain-invariant features and Aspect Gating Fusion to merge the ID features, the domain-invariant text features, and the task-specific heterogeneous sparse features into representations of queries and items. Additionally, samples from multiple search and recommendation scenarios are trained jointly with a Domain Adaptive Multi-Task module to obtain the multi-domain foundation model. We apply the S&R Multi-Domain Foundation model to cold-start scenarios in a pretrain-finetune manner, where it achieves better performance than other state-of-the-art transfer learning methods. The S&R Multi-Domain Foundation model has been successfully deployed in online services of the Alipay mobile application, such as content query recommendation and service card recommendation.
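To make the idea of Aspect Gating Fusion more concrete, the following is a minimal sketch of one plausible form of gated fusion over three aspect embeddings (ID, LLM text, and sparse features). The abstract does not specify the gating mechanism, so the softmax gate computed from the concatenated aspects, the function name `aspect_gating_fusion`, and the parameter `gate_weights` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aspect_gating_fusion(aspects, gate_weights):
    """Fuse K aspect embeddings of equal length into one vector.

    aspects: list of K vectors, each a list of `dim` floats.
    gate_weights: K rows of length K * dim (hypothetical learned parameters).
    Returns (fused_vector, gates), where gates are K weights summing to 1.
    """
    # Concatenate all aspects so each gate can look at every aspect.
    concat = [v for aspect in aspects for v in aspect]
    # One linear score per aspect, normalized into gate weights.
    scores = [sum(w * x for w, x in zip(row, concat)) for row in gate_weights]
    gates = softmax(scores)
    # Weighted sum of the aspect embeddings, dimension by dimension.
    dim = len(aspects[0])
    fused = [sum(g * a[i] for g, a in zip(gates, aspects)) for i in range(dim)]
    return fused, gates


random.seed(0)
dim = 4
# Stand-ins for the three aspects named in the abstract (random values here).
id_emb = [random.gauss(0, 1) for _ in range(dim)]
text_emb = [random.gauss(0, 1) for _ in range(dim)]
sparse_emb = [random.gauss(0, 1) for _ in range(dim)]
gate_weights = [[random.gauss(0, 0.1) for _ in range(3 * dim)] for _ in range(3)]

fused, gates = aspect_gating_fusion([id_emb, text_emb, sparse_emb], gate_weights)
```

In this sketch the gates let the model lean on the text aspect when ID embeddings are poorly trained, which is one reason such a fusion could help in cold-start scenarios; in a real system the gate parameters would be learned jointly with the rest of the network.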