GenCI: Generative Modeling of User Interest Shift via Cohort-based Intent Learning for CTR Prediction

Click-through rate (CTR) prediction plays a pivotal role in online advertising and recommender systems. Despite notable progress in modeling user preferences from historical behaviors, two key challenges persist. First, existing discriminative paradigms focus on matching candidates to user history, often overfitting to historically dominant features and failing to adapt to rapid interest shifts. Second, a critical information chasm emerges from the point-wise ranking paradigm. By scoring each candidate in isolation, CTR models discard the rich contextual signal implied by the recalled set as a whole, leading to a misalignment in which long-term preferences often override the user's immediate, evolving intent. To address these issues, we propose GenCI, a generative user intent framework that leverages semantic interest cohorts to model dynamic user preferences for CTR prediction. The framework first employs a generative model, trained with a next-item prediction (NTP) objective, to proactively produce candidate interest cohorts. These cohorts serve as explicit, candidate-agnostic representations of a user's immediate intent. A hierarchical candidate-aware network then injects this contextual signal into the ranking stage, refining the cohorts with cross-attention so that they align with both the user history and the target item. The entire model is trained end-to-end, creating a more aligned and effective CTR prediction pipeline. Extensive experiments on three widely used datasets demonstrate the effectiveness of our approach.
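To make the candidate-aware refinement step concrete, the following is a minimal sketch of single-query cross-attention over a set of cohort embeddings. It is not the GenCI architecture, which the abstract does not specify in detail. The function names (`cross_attend`, `ctr_score`), the way the query is formed (target embedding plus mean-pooled history), and the dot-product scoring head are all assumptions made for illustration only.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys):
    """Scaled dot-product attention with one query; keys double as values."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(keys[0])
    pooled = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
    return pooled, weights


def ctr_score(target, history, cohorts):
    """Toy click probability from cohort embeddings (hypothetical fusion).

    Assumption: the query is the target embedding plus the mean of the
    history embeddings, and the logit is a plain dot product between the
    attended intent vector and the target.
    """
    dim = len(target)
    hist_mean = [sum(h[d] for h in history) / len(history) for d in range(dim)]
    query = [t + h for t, h in zip(target, hist_mean)]
    intent, weights = cross_attend(query, cohorts)
    logit = dot(intent, target)
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Two generated cohorts; the first points in the same direction as the target.
prob, attn = ctr_score(
    target=[1.0, 0.0],
    history=[[1.0, 0.0], [0.0, 1.0]],
    cohorts=[[2.0, 0.0], [0.0, 2.0]],
)
print(prob, attn)
```

In this toy setting the cohort aligned with the target item receives the larger attention weight, which illustrates how a cohort set produced independently of any candidate can still be re-weighted per candidate at ranking time. A trained model would use learned projections for queries, keys, and values rather than raw embeddings.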
