Conversational recommender systems (CRSs) aim to understand the information needs and preferences expressed in a dialogue in order to recommend suitable items to the user. Most existing conversational recommendation datasets are synthesized or simulated through crowdsourcing, which leaves a large gap between them and real-world scenarios. To bridge this gap, previous work contributed E-ConvRec, a dataset based on pre-sales dialogues between users and customer service staff in E-commerce scenarios. However, E-ConvRec supplies only coarse-grained annotations and general tasks for making recommendations in pre-sales dialogues. In contrast, we use real user needs as a clue to explore E-commerce conversational recommendation in complex pre-sales dialogues, a setting we call user needs-centric E-commerce conversational recommendation (UNECR). In this paper, we construct a user needs-centric E-commerce conversational recommendation dataset (U-NEED) from real-world E-commerce scenarios. U-NEED consists of 3 types of resources: (i) 7,698 fine-grained annotated pre-sales dialogues in 5 top categories, (ii) 333,879 user behaviors, and (iii) 332,148 product knowledge tuples. To facilitate research on UNECR, we propose 5 critical tasks: (i) pre-sales dialogue understanding, (ii) user needs elicitation, (iii) user needs-based recommendation, (iv) pre-sales dialogue generation, and (v) pre-sales dialogue evaluation. We establish baseline methods and evaluation metrics for each task, and report experimental results for all 5 tasks on U-NEED, both overall and in 3 typical categories. The results indicate that the challenges of UNECR differ across categories.