Click-through rate (CTR) prediction is a pivotal task in product and content recommendation, where learning effective feature embeddings is crucial. However, traditional methods typically learn fixed feature representations and do not refine them dynamically according to context information, which leads to suboptimal performance. Some recent approaches attempt to address this issue by learning bit-wise weights or augmented embeddings for feature representations, but they suffer from uninformative or redundant features in the context. To tackle this problem, we draw inspiration from the Global Workspace Theory of conscious processing, which posits that only a specific subset of the product features is pertinent while the rest can be noisy and even detrimental to human click behavior. We propose DELTA, a CTR model that enables Dynamic Embedding Learning with Truncated Conscious Attention. DELTA contains two key components: (I) a conscious truncation module (CTM), which uses curriculum learning to apply adaptive truncation to attention weights and thereby select the most critical feature in the context; and (II) explicit embedding optimization (EEO), which applies an auxiliary task during training that propagates the gradient directly and independently from the loss layer to the embedding layer, thereby optimizing the embedding explicitly via linear feature crossing. Extensive experiments on five challenging CTR datasets demonstrate that DELTA achieves new state-of-the-art performance among current CTR methods.
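The abstract does not specify how the CTM truncates attention weights or how its curriculum is scheduled. As a rough illustration of the general idea only, the sketch below assumes top-k truncation (a softmax restricted to the k highest-scoring features) and a linear schedule that shrinks k over training. The names `keep_count` and `truncated_attention`, the parameter `k_min`, and the linear schedule are all hypothetical and are not taken from DELTA.

```python
import math


def keep_count(step, total_steps, n_features, k_min=1):
    """Hypothetical curriculum schedule (an assumption, not DELTA's rule):
    keep every feature at step 0, then shrink linearly to k_min."""
    frac = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return max(k_min, round(n_features - frac * (n_features - k_min)))


def truncated_attention(scores, k):
    """Softmax over the k largest scores; every other weight is exactly 0."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k largest scores (ties resolved by position).
    kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Subtract the largest kept score for numerical stability.
    top = max(scores[i] for i in kept)
    exp = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(exp.values())
    return [exp[i] / total if i in exp else 0.0 for i in range(len(scores))]


if __name__ == "__main__":
    scores = [2.0, 0.5, 1.0, -1.0]  # made-up attention scores for 4 features
    for step in (0, 50, 100):
        k = keep_count(step, total_steps=100, n_features=len(scores), k_min=2)
        print(step, k, truncated_attention(scores, k))
```

Under these assumptions, early training steps attend to every feature in the context, while later steps renormalize attention over a shrinking subset, so low-scoring features contribute nothing to the refined embedding.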