Click-through rate (CTR) prediction, which estimates the probability that a user clicks on an item, is a key task in online advertising. Many existing CTR models focus on modeling feature interactions within a single domain, which makes them inadequate for the multi-domain recommendation requirements of real industrial scenarios. Some recent approaches propose intricate architectures to enhance knowledge sharing and improve model training across multiple domains. However, because they rely on modeling ID features (e.g., item ID), these approaches are difficult to transfer to new recommendation domains. To address this issue, we propose the Universal Feature Interaction Network (UFIN) for CTR prediction. UFIN exploits textual data to learn universal feature interactions that can be effectively transferred across diverse domains. To learn universal feature representations, we treat text and features as two different modalities and propose an encoder-decoder network built on a Large Language Model (LLM) to enforce the transfer of data from the text modality to the feature modality. Building on this foundation, we further develop a mixture-of-experts (MoE) enhanced adaptive feature interaction model to learn transferable collaborative patterns across multiple domains. Furthermore, we propose a multi-domain knowledge distillation framework to enhance feature interaction learning. With these methods, UFIN can effectively bridge the semantic gap to learn common knowledge across various domains, overcoming the limitations of ID-based models. Extensive experiments on eight datasets demonstrate the effectiveness of UFIN in both multi-domain and cross-platform settings. Our code is available at https://github.com/RUCAIBox/UFIN.
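As a rough illustration of the mixture-of-experts idea mentioned in the abstract, the following Python sketch combines several expert scorers through a softmax gate to produce a click probability. It is not the UFIN implementation: the function names (`softmax`, `moe_click_probability`), the linear gate, and the toy experts are all assumptions made for illustration. The authors' actual model is in the linked repository.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def moe_click_probability(
    x: Vector,
    gate_weights: Sequence[Vector],
    experts: Sequence[Callable[[Vector], float]],
) -> float:
    """Gate-weighted mixture of expert logits, squashed to a probability.

    Assumption: a linear gate (one weight vector per expert) and experts
    that each map the feature vector to a scalar logit.
    """
    gates = softmax([dot(w, x) for w in gate_weights])
    logit = sum(g * expert(x) for g, expert in zip(gates, experts))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


if __name__ == "__main__":
    # Hypothetical experts: a first-order (linear) scorer and a
    # second-order scorer over pairwise feature products.
    def linear(x: Vector) -> float:
        return 0.3 * x[0] - 0.1 * x[1]

    def pairwise(x: Vector) -> float:
        return 0.5 * x[0] * x[1]

    features = [1.0, 2.0]
    gate = [[0.2, 0.0], [0.0, 0.2]]
    print(moe_click_probability(features, gate, [linear, pairwise]))
```

The gate lets each input decide how much weight each interaction pattern receives. This is one common way for a single model to adapt its behavior across domains.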