Conversational recommendation systems (CRS) aim to acquire users' dynamic preferred attributes through conversation, in a timely and proactive manner, for item recommendation. Each turn of a CRS naturally involves two decision-making processes with different roles that influence each other: 1) the director, which selects the follow-up option (i.e., ask or recommend) that is more effective at reducing the action space and acquiring user preferences; and 2) the actor, which accordingly chooses primitive actions (i.e., the attribute to ask about or the item to recommend) that satisfy user preferences, and gives feedback for estimating the effectiveness of the director's option. However, existing methods rely heavily on a unified decision-making module or heuristic rules, neglecting both the distinct roles of the two decision procedures and the mutual influence between them. To address this, we propose a novel Director-Actor Hierarchical Conversational Recommender (DAHCR), in which the director selects the most effective option and the actor then chooses primitive actions that satisfy user preferences. Specifically, we develop a dynamic hypergraph to model user preferences and introduce an intrinsic motivation to train the director from weak supervision. Finally, to alleviate the adverse effect of model bias on the mutual influence between the director and actor, we model the director's option by sampling from a categorical distribution. Extensive experiments demonstrate that DAHCR outperforms state-of-the-art methods.
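The two-level decision in a single turn can be illustrated with a minimal sketch. This is not the DAHCR implementation: the function names, the plain score dictionaries, and the softmax over two option logits are all assumptions made for illustration, and the hypergraph encoder and the intrinsic-motivation training are omitted. It shows only the control flow in which a director samples an option from a categorical distribution and an actor then picks a primitive action within that option.

```python
import math
import random

# Hypothetical option set for the director; the names are illustrative only.
OPTIONS = ("ask", "recommend")


def softmax(logits):
    """Turn raw option logits into a categorical distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def director_sample(option_logits, rng):
    """Director: sample the option from the categorical distribution
    instead of always taking the argmax."""
    probs = softmax(option_logits)
    option = rng.choices(OPTIONS, weights=probs, k=1)[0]
    return option, probs


def actor_choose(option, attribute_scores, item_scores):
    """Actor: pick the highest-scoring primitive action within the option.
    The scores are placeholder inputs, assumed to come from some preference model."""
    candidates = attribute_scores if option == "ask" else item_scores
    return max(candidates, key=candidates.get)


def run_turn(option_logits, attribute_scores, item_scores, rng):
    """One conversational turn: the director picks an option, then the actor acts."""
    option, probs = director_sample(option_logits, rng)
    action = actor_choose(option, attribute_scores, item_scores)
    return option, action, probs


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_turn([0.3, -0.1], {"genre": 0.7, "year": 0.2}, {"item_42": 0.9}, rng))
```

Sampling keeps some probability mass on the lower-scored option, so the director does not commit deterministically to whichever option a possibly biased model currently favours.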