Sequential recommenders have been widely used in industry due to their strength in modeling user preferences. While these models excel at learning a user's positive interests, less attention has been paid to learning from negative user feedback. Negative user feedback is an important lever of user control, and comes with an expectation that recommenders should respond quickly and reduce similar recommendations to the user. However, negative feedback signals are often ignored in the training objective of sequential retrieval models, which primarily aim at predicting positive user interactions. In this work, we incorporate explicit and implicit negative user feedback into the training objective of sequential recommenders in the retrieval stage using a "not-to-recommend" loss function that optimizes for the log-likelihood of not recommending items with negative feedback. We demonstrate the effectiveness of this approach using live experiments on a large-scale industrial recommender system. Furthermore, we address a challenge in measuring recommender responsiveness to negative feedback by developing a counterfactual simulation framework to compare recommender responses between different user actions, showing improved responsiveness from the modeling change.
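As an illustration of the "not-to-recommend" objective described above, the following is a minimal sketch, not the authors' implementation. It assumes the retrieval model produces a softmax distribution over candidate items for a given user state, that positive labels contribute the usual log-likelihood term log p(item), and that items with negative feedback contribute log(1 - p(item)). The function names, the label format, and the `eps` clamp are hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of item scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def not_to_recommend_loss(logits, labels, eps=1e-12):
    """Sketch of a loss combining positive and negative feedback.

    logits: scores over candidate items for one user state (assumed input).
    labels: list of (item_index, is_positive) pairs (hypothetical format).

    Positive items add -log p(item); items with negative feedback add
    -log(1 - p(item)), i.e. the negative log-likelihood of NOT
    recommending them.
    """
    probs = softmax(logits)
    loss = 0.0
    for idx, is_positive in labels:
        p = probs[idx]
        if is_positive:
            loss -= math.log(max(p, eps))
        else:
            loss -= math.log(max(1.0 - p, eps))
    return loss


# Item 0 received negative feedback; item 1 received a positive interaction.
example = not_to_recommend_loss([1.5, 0.2, -0.3], [(0, False), (1, True)])
```

Under these assumptions, lowering the score of an item with negative feedback reduces the loss, which is the behavior the abstract describes as reducing similar recommendations.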