DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction

Deep & Cross Network and its derivative models have become an important paradigm in click-through rate (CTR) prediction due to their effective balance between computational cost and performance. However, these models face four major limitations: (1) while most models claim to capture high-order feature interactions, they often do so implicitly and non-interpretably through deep neural networks (DNN), which limits the trustworthiness of the model's predictions; (2) the performance of existing explicit feature interaction methods is often weaker than that of implicit DNN, undermining their necessity; (3) many models fail to adaptively filter noise while enhancing the order of feature interactions; (4) the fusion methods of most models cannot provide suitable supervision signals for their different interaction methods. To address the identified limitations, this paper proposes the next generation Deep Cross Network (DCNv3) and Shallow & Deep Cross Network (SDCNv3). These models ensure interpretability in feature interaction modeling while exponentially increasing the order of feature interactions to achieve genuine Deep Crossing rather than just Deep & Cross. Additionally, we employ a Self-Mask operation to filter noise and reduce the number of parameters in the cross network by half. In the fusion layer, we use a simple yet effective loss weight calculation method called Tri-BCE to provide appropriate supervision signals. Comprehensive experiments on six datasets demonstrate the effectiveness, efficiency, and interpretability of DCNv3 and SDCNv3. The code, running logs, and detailed hyperparameter configurations are available at: https://anonymous.4open.science/r/DCNv3-E352.
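The contrast between linearly and exponentially growing interaction order can be illustrated with a toy calculation. The sketch below is not the paper's layer definition: it treats the input as a single symbolic variable, fixes all weights to 1 and biases to 0, and tracks only the polynomial degree of each layer's output. The linear recursion follows the familiar DCN form (cross with the original input `x0`), while the self-crossing recursion `x * x + x` is an assumed stand-in for a layer that crosses its own output. The helper names are hypothetical.

```python
# Toy illustration: polynomials are coefficient lists, index = power of the input.

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Add two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def degree(p):
    """Highest power with a non-zero coefficient."""
    return max(i for i, c in enumerate(p) if c != 0)

def linear_cross_degrees(num_layers):
    """DCN-style recursion x_{l+1} = x0 * x_l + x_l (weights fixed to 1)."""
    x0 = [0, 1]  # the input itself, degree 1
    x = x0
    degrees = []
    for _ in range(num_layers):
        x = poly_add(poly_mul(x0, x), x)
        degrees.append(degree(x))
    return degrees

def exponential_cross_degrees(num_layers):
    """Assumed self-crossing recursion x_{l+1} = x_l * x_l + x_l."""
    x = [0, 1]
    degrees = []
    for _ in range(num_layers):
        x = poly_add(poly_mul(x, x), x)
        degrees.append(degree(x))
    return degrees

if __name__ == "__main__":
    print(linear_cross_degrees(4))       # order grows by one per layer
    print(exponential_cross_degrees(4))  # order doubles per layer
```

Under these simplifying assumptions, four layers of the linear recursion reach order 5, while four layers of the self-crossing recursion reach order 16, which is the sense in which the order grows exponentially with depth.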

