Addressing Missing Data Issue for Diffusion-based Recommendation

Diffusion models have shown significant potential for generating oracle items that best match user preferences, guided by users' historical interaction sequences. However, the quality of this guidance is often compromised by unpredictable missing data in the observed sequences, leading to suboptimal item generation. Because missing data is uncertain in both occurrence and content, recovering it is impractical and may introduce additional errors. To tackle this challenge, we propose a novel dual-side Thompson sampling-based Diffusion Model (TDM), which simulates extra missing data in the guidance signals so that diffusion models learn to handle existing missing data through extrapolation. To preserve the evolution of user preferences in sequences despite the extra missing data, we introduce Dual-side Thompson Sampling, which implements the simulation with two probability models and samples by exploiting user preference from both item continuity and sequence stability. TDM strategically removes items from sequences based on dual-side Thompson sampling and treats the edited sequences as guidance for diffusion models, enhancing the models' robustness to missing data through consistency regularization. Additionally, to improve generation efficiency, TDM is implemented under denoising diffusion implicit models (DDIM) to accelerate the reverse process. Extensive experiments and theoretical analysis validate the effectiveness of TDM in addressing missing data in sequential recommendation.
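
The item-removal step described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's definition: the function name `thompson_drop`, the per-item `continuity` and `stability` scores in [0, 1], the `strength` parameter, the Beta posteriors, and the product used to combine the two draws are all hypothetical. The sketch only shows the general pattern of Thompson sampling applied to sequence editing: draw a keep-probability per item from each of two probability models, then remove the items whose combined draw is lowest.

```python
import random


def thompson_drop(sequence, continuity, stability, n_drop, strength=10.0, rng=None):
    """Remove n_drop items from a sequence using two-sided Thompson sampling.

    continuity and stability are hypothetical per-item scores in [0, 1];
    how such scores would be computed is not specified in the abstract.
    """
    rng = rng or random.Random()
    if not (len(sequence) == len(continuity) == len(stability)):
        raise ValueError("sequence and score lists must have equal length")
    if not 0 <= n_drop < len(sequence):
        raise ValueError("n_drop must leave at least one item in the sequence")
    if any(not 0.0 <= s <= 1.0 for s in list(continuity) + list(stability)):
        raise ValueError("scores must lie in [0, 1]")

    draws = []
    for c, s in zip(continuity, stability):
        # Assumed Beta posteriors: a higher score shifts mass toward "keep".
        keep_c = rng.betavariate(1.0 + strength * c, 1.0 + strength * (1.0 - c))
        keep_s = rng.betavariate(1.0 + strength * s, 1.0 + strength * (1.0 - s))
        # Assumed combination rule: product of the two sampled keep-probabilities.
        draws.append(keep_c * keep_s)

    # Act greedily on the sampled values: drop the lowest-scoring positions.
    order = sorted(range(len(sequence)), key=draws.__getitem__)
    dropped = set(order[:n_drop])
    # Keep the surviving items in their original order.
    return [item for i, item in enumerate(sequence) if i not in dropped]


if __name__ == "__main__":
    history = ["i1", "i2", "i3", "i4", "i5", "i6"]
    cont = [0.9, 0.8, 0.2, 0.7, 0.9, 0.6]
    stab = [0.8, 0.9, 0.3, 0.6, 0.9, 0.7]
    print(thompson_drop(history, cont, stab, n_drop=2, rng=random.Random(0)))
```

Because the keep-probabilities are sampled rather than fixed, repeated calls produce different edited sequences, and low-scoring items are removed more often than high-scoring ones. In a setup like the one the abstract describes, the original and edited sequences would both serve as guidance, with a consistency term penalizing disagreement between the resulting generations.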

