Sequential user behavior modeling is pivotal for Click-Through Rate (CTR) prediction yet is hindered by three intrinsic bottlenecks: (1) the "Attention Sink" phenomenon, where standard Softmax compels the model to allocate probability mass to noisy behaviors; (2) the Static Query Assumption, which overlooks dynamic shifts in user intent driven by real-time contexts; and (3) Rigid View Aggregation, which fails to adaptively weight heterogeneous temporal signals according to the decision context. To bridge these gaps, we propose GAP-Net (Gated Adaptive Progressive Network), a unified framework establishing a "Triple Gating" architecture to progressively refine information from micro-level features to macro-level views. GAP-Net operates through three integrated mechanisms: (1) Adaptive Sparse-Gated Attention (ASGA) employs micro-level gating to enforce sparsity, effectively suppressing massive noise activations; (2) Gated Cascading Query Calibration (GCQC) dynamically aligns user intent by bridging real-time triggers and long-term memories via a meso-level cascading channel; and (3) Context-Gated Denoising Fusion (CGDF) performs macro-level modulation to orchestrate the aggregation of multi-view sequences. Extensive experiments on industrial datasets demonstrate that GAP-Net achieves substantial improvements over state-of-the-art baselines, exhibiting superior robustness against interaction noise and intent drift.
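To make the "Attention Sink" bottleneck concrete, the following minimal sketch contrasts standard Softmax attention with a simple gated variant. It is not the paper's ASGA formulation, which the abstract does not specify. The function names, the element-wise sigmoid gate, and the example scores are all assumptions chosen for illustration. It shows only that Softmax weights must sum to 1 even when every behavior is weakly relevant, whereas a per-item gate lets the total attention mass shrink toward zero.

```python
import math


def softmax(scores):
    """Standard softmax: the returned weights always sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_attention(scores, gate_logits):
    """Hypothetical gated variant (an assumption, not the paper's ASGA):
    each softmax weight is scaled by a sigmoid gate in (0, 1), so the
    total mass is no longer forced to equal 1."""
    weights = softmax(scores)
    return [w * sigmoid(g) for w, g in zip(weights, gate_logits)]


# Illustrative relevance scores for four past behaviors, all weakly relevant.
scores = [-4.0, -4.2, -3.9, -4.1]

plain = softmax(scores)
# For simplicity the raw scores double as gate logits; a real model
# would presumably learn the gate from features.
gated = gated_attention(scores, gate_logits=scores)

print("softmax mass:", sum(plain))  # forced to 1 despite low relevance
print("gated mass:  ", sum(gated))  # can shrink when nothing is relevant
```

In this toy setting the plain Softmax spreads its full unit of probability mass over four noisy behaviors, while the gated weights stay close to zero, which is the kind of suppression that a micro-level gate is meant to allow.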