Conversational recommendation systems (CRS) commonly assume that users have clear preferences, which can lead to over-filtering of relevant alternatives. In practice, users often exhibit vague, non-binary preferences. We introduce the Vague Preference Multi-round Conversational Recommendation (VPMCR) scenario, which uses a soft estimation mechanism to accommodate users' vague and dynamic preferences and to mitigate over-filtering. For VPMCR, we propose Vague Preference Policy Learning (VPPL), which consists of Ambiguity-aware Soft Estimation (ASE) and Dynamism-aware Policy Learning (DPL). ASE captures preference vagueness by estimating preference scores for both clicked and non-clicked options, using a choice-based approach with time-aware preference decay. DPL uses the preference distribution produced by ASE to guide the conversation, deciding whether to recommend items or ask about attributes while adapting to preference changes. Extensive experiments demonstrate that VPPL is effective within VPMCR, outperforms existing methods, and sets a new benchmark. Our work advances CRS by accommodating users' inherent ambiguity and relative decision-making, improving real-world applicability.
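The abstract gives no formulas for ASE, so the following is only a minimal illustrative sketch of the general idea it describes: keep a soft score for every option, discount older evidence over time, and derive a preference distribution instead of hard-filtering non-clicked options. The function names, the decay factor, the reward and penalty values, and the softmax normalization are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def soft_preference_update(scores, shown, clicked, decay=0.9,
                           click_gain=1.0, skip_penalty=0.5):
    """One hypothetical round of soft preference estimation.

    scores: dict mapping option -> running preference score.
    shown: options presented to the user in this round.
    clicked: subset of `shown` that the user selected.
    All numeric defaults are illustrative assumptions.
    """
    # Time-aware decay: evidence from earlier rounds counts for less.
    updated = {option: value * decay for option, value in scores.items()}
    # Choice-based evidence: clicked options gain score, while shown but
    # non-clicked options are only softly penalized, never removed.
    for option in shown:
        delta = click_gain if option in clicked else -skip_penalty
        updated[option] = updated.get(option, 0.0) + delta
    return updated


def preference_distribution(scores, temperature=1.0):
    """Turn soft scores into a probability distribution via softmax."""
    if not scores:
        raise ValueError("scores must not be empty")
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {o: math.exp((v - peak) / temperature) for o, v in scores.items()}
    total = sum(exps.values())
    return {o: e / total for o, e in exps.items()}


# Two simulated rounds: the user first clicks "rock" and "jazz",
# then skips "jazz" and clicks "folk" instead.
scores = soft_preference_update({}, ["rock", "jazz", "pop"], {"rock", "jazz"})
scores = soft_preference_update(scores, ["jazz", "folk"], {"folk"})
dist = preference_distribution(scores)
```

In this sketch every option keeps a non-zero probability, so an option the user skipped once can still be recommended later. A policy in the spirit of DPL could consume `dist` to decide between asking about an attribute and recommending an item, although the abstract does not specify how that decision is made.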