Recent advances in foundation models have established scaling laws that enable the development of larger models to achieve enhanced performance, motivating extensive research into large-scale recommendation models. However, simply increasing the model size in recommendation systems, even with large amounts of data, does not always yield the expected performance improvements. In this paper, we propose a novel framework, Collaborative Ensemble Training Network (CETNet), which leverages multiple distinct models, each with its own embedding table, to capture unique feature interaction patterns. Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning, in which models iteratively refine their predictions. To dynamically balance contributions from each model, we introduce a confidence-based fusion mechanism using a general softmax, where model confidence is computed via negative entropy. This design ensures that more confident models have a greater influence on the final prediction while still benefiting from the complementary strengths of the other models. We validate our framework on three public datasets (AmazonElectronics, TaobaoAds, and KuaiVideo) as well as a large-scale industrial dataset from Meta, demonstrating its superior performance over individual models and state-of-the-art baselines. Additionally, we conduct further experiments on the Criteo and Avazu datasets to compare our method with the multi-embedding paradigm. Our results show that our framework achieves comparable or better performance with smaller embedding sizes, offering a scalable and efficient solution for CTR prediction tasks.
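A minimal sketch of the confidence-based fusion described above, assuming binary CTR outputs: each model's confidence is its negative prediction entropy, and a softmax over these confidences weights the fused probability. The helper names and the temperature `tau` are illustrative, not from the paper.

```python
import math

def binary_entropy(p, eps=1e-12):
    # Entropy of a Bernoulli prediction; low entropy means high confidence.
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def confidence_fused_prediction(probs, tau=1.0):
    """Fuse per-model CTR probabilities with softmax weights computed
    over negative entropy, so more confident models weigh more."""
    confidences = [-binary_entropy(p) / tau for p in probs]
    # Numerically stable softmax over the confidence scores.
    m = max(confidences)
    exps = [math.exp(c - m) for c in confidences]
    z = sum(exps)
    weights = [e / z for e in exps]
    fused = sum(w * p for w, p in zip(weights, probs))
    return fused, weights
```

For example, a model predicting 0.9 (low entropy) receives a larger weight than one predicting 0.5 (maximum entropy), pulling the ensemble toward the more confident prediction.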