Implicit feedback, such as user clicks, serves as the primary data source for modern recommender systems. However, click interactions inherently contain substantial noise, including accidental clicks, clickbait-induced interactions, and exploratory browsing behaviors that do not reflect genuine user preferences. Training recommendation models on such noisy positive samples degrades prediction accuracy and yields unreliable recommendations. In this paper, we propose SAID (Semantics-Aware Implicit Denoising), a simple yet effective framework that leverages semantic consistency between user interests and item content to identify and downweight potentially noisy interactions. Our approach constructs textual user interest profiles from historical behaviors and computes their semantic similarity with target item descriptions using pre-trained language model (PLM)-based text encoders. The similarity scores are then transformed into sample weights that modulate the training loss, effectively reducing the impact of semantically inconsistent clicks. Unlike existing denoising methods that require complex auxiliary networks or multi-stage training procedures, SAID modifies only the loss function while keeping the backbone recommendation model unchanged. Extensive experiments on two real-world datasets demonstrate that SAID consistently improves recommendation performance, achieving up to 2.2% relative improvement in AUC over strong baselines, with particularly notable robustness under high-noise conditions.
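The weighting mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the exact similarity-to-weight transform, so a sigmoid mapping with a temperature `alpha` is assumed here, and toy vectors stand in for the PLM text embeddings of user interest profiles and item descriptions.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between a user-profile embedding and an item embedding."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def said_sample_weights(user_embs, item_embs, alpha=5.0):
    """Map semantic similarity to a per-sample weight in (0, 1).

    Hypothetical transform: a sigmoid over the cosine similarity, so that
    semantically consistent clicks (high similarity) receive weights near 1
    and inconsistent clicks are downweighted.
    """
    sims = np.array([cosine_similarity(u, i)
                     for u, i in zip(user_embs, item_embs)])
    return 1.0 / (1.0 + np.exp(-alpha * sims))

def weighted_bce_loss(preds, labels, weights):
    """Pointwise binary cross-entropy modulated by the sample weights.

    Only the loss is changed; the backbone model producing `preds`
    stays untouched, as in the SAID framework.
    """
    eps = 1e-8
    losses = -(labels * np.log(preds + eps)
               + (1.0 - labels) * np.log(1.0 - preds + eps))
    return float(np.mean(weights * losses))

# Toy embeddings standing in for PLM encoder outputs (assumed, not real data):
# sample 0 is a semantically consistent click, sample 1 an inconsistent one.
user_embs = np.array([[1.0, 0.0], [1.0, 0.0]])
item_embs = np.array([[0.9, 0.1], [-0.8, 0.6]])

weights = said_sample_weights(user_embs, item_embs)
loss = weighted_bce_loss(preds=np.array([0.7, 0.7]),
                         labels=np.array([1.0, 1.0]),
                         weights=weights)
```

With this transform, the noisy second click contributes far less to the loss than the consistent first click, while the gradient path through the backbone model is unchanged.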