Automated Data Denoising for Recommendation

In real-world scenarios, most platforms collect both large-scale, naturally noisy implicit feedback and small-scale yet highly relevant explicit feedback. Because of data sparsity, implicit feedback is often the default choice for training recommender systems (RS). However, such data can be very noisy because user behavior is random and diverse. For instance, a large portion of clicks may not reflect true user preferences, and many purchases may end in negative reviews or returns. Fortunately, by using the strengths of each type of feedback to compensate for the weaknesses of the other, we can mitigate this issue at almost no cost. In this work, we propose an Automated Data Denoising framework for recommendation, AutoDenoise, which uses a small amount of explicit data as a validation set to guide recommender training. Inspired by the generalized definition of curriculum learning (CL), AutoDenoise learns to automatically and dynamically assign the most appropriate (discrete or continuous) weight to each implicit data sample during training, guided by validation performance. Specifically, we use a carefully designed controller network to generate the weights, combine the weights with the loss of each input sample to train the recommender system, and optimize the controller with reinforcement learning to maximize the expected accuracy of the trained RS on the noise-free validation set. Extensive experiments indicate that AutoDenoise boosts the performance of state-of-the-art recommendation algorithms on several public benchmark datasets.
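The training loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and several parts are assumptions made for illustration: the controller is reduced to one learnable logit per sample instead of a network, the "recommender" is a single scalar fitted in closed form, the reinforcement learning step is plain REINFORCE with a moving-average baseline, and only discrete (keep or drop) weights are covered. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    # Clamp the logit so math.exp cannot overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def fit_toy_recommender(labels, weights):
    """Minimizer of the weighted squared loss sum_i w_i * (theta - y_i)^2."""
    total = sum(weights)
    if total == 0:
        return 0.0  # nothing kept: fall back to an uninformed prediction
    return sum(w * y for w, y in zip(weights, labels)) / total


def train_controller(labels, val_target, steps=2000, lr=0.5, seed=0):
    """Learn per-sample keep probabilities with REINFORCE (toy assumption)."""
    rng = random.Random(seed)
    logits = [0.0] * len(labels)  # one logit per implicit sample
    baseline = 0.0                # moving average of past rewards
    for _ in range(steps):
        probs = [sigmoid(z) for z in logits]
        # Discrete weights: 1 keeps the sample, 0 drops it.
        weights = [1 if rng.random() < p else 0 for p in probs]
        theta = fit_toy_recommender(labels, weights)
        # Reward: negative squared error on the clean validation target.
        reward = -(theta - val_target) ** 2
        advantage = reward - baseline
        # d/dz log Bernoulli(w | sigmoid(z)) = w - p
        logits = [z + lr * advantage * (w - p)
                  for z, w, p in zip(logits, weights, probs)]
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(z) for z in logits]


# Six clean clicks (label 1.0) and two noisy ones (label 0.0).
labels = [1.0] * 6 + [0.0] * 2
keep_probs = train_controller(labels, val_target=1.0)
print([round(p, 2) for p in keep_probs])
```

In this toy setting, keeping a noisy sample pulls the fitted value away from the clean validation target and lowers the reward, so the controller should learn lower keep probabilities for the last two samples than for the first six.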

