Multi-domain learning (MDL) has become a prominent topic in enhancing the quality of personalized services. It is critical to learn the commonalities between domains while preserving the distinct characteristics of each domain. However, this leads to a challenging dilemma in MDL. On the one hand, a model needs domain-aware modules such as experts or embeddings to preserve each domain's distinctiveness. On the other hand, real-world datasets often exhibit long-tailed distributions across domains, so some domains may lack sufficient samples to train their specific modules effectively. Unfortunately, nearly all existing work falls short of resolving this dilemma. To this end, we propose a novel Cross-experts Covariance Loss for Disentangled Learning model (Crocodile). It employs multiple embedding tables to make the model domain-aware at the embedding layer, which accounts for most of the model's parameters, and applies a covariance loss to these embeddings to disentangle them, enabling the model to capture diverse user interests among domains. Empirical analysis demonstrates that our method successfully addresses both challenges and outperforms all state-of-the-art methods on public datasets. During online A/B testing on Tencent's advertising platform, Crocodile achieves a 0.72% CTR lift and a 0.73% GMV lift on a primary advertising scenario.
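The abstract does not spell out the exact form of the covariance loss, so the following is only a minimal sketch of the general idea. It assumes that each embedding table (or expert) yields a `(batch, dim)` representation and that disentanglement is encouraged by penalizing the cross-covariance between every pair of representations. The function names `cross_covariance_loss` and `pairwise_covariance_penalty`, and the mean-absolute-value reduction, are illustrative assumptions rather than the definition used in Crocodile.

```python
import numpy as np


def cross_covariance_loss(z_a, z_b):
    """Mean absolute cross-covariance between two (batch, dim) matrices.

    Assumption: a small value means the two representations are close to
    linearly uncorrelated over the batch.
    """
    # Center each feature over the batch dimension.
    z_a = z_a - z_a.mean(axis=0, keepdims=True)
    z_b = z_b - z_b.mean(axis=0, keepdims=True)
    n = z_a.shape[0]
    # Unbiased sample cross-covariance, shape (dim_a, dim_b).
    cov = z_a.T @ z_b / (n - 1)
    return float(np.abs(cov).mean())


def pairwise_covariance_penalty(representations):
    """Sum the cross-covariance loss over all distinct pairs of representations."""
    total = 0.0
    for i in range(len(representations)):
        for j in range(i + 1, len(representations)):
            total += cross_covariance_loss(representations[i], representations[j])
    return total


# Hypothetical usage: three representations of the same batch of 512 samples.
rng = np.random.default_rng(0)
z1 = rng.standard_normal((512, 8))
z2 = rng.standard_normal((512, 8))   # drawn independently of z1
z3 = z1.copy()                       # fully redundant with z1
print(pairwise_covariance_penalty([z1, z2]))  # independent pair
print(pairwise_covariance_penalty([z1, z3]))  # redundant pair, larger penalty
```

In a training setup, a term of this kind would typically be added to the task loss with a weighting coefficient, so that the embedding tables are pushed toward capturing different aspects of user interest; how Crocodile weights and combines the terms is not stated in the abstract.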