Recommendation systems play a central role in shaping user experience and engagement in e-commerce. The rise of Consumer-to-Consumer (C2C) recommendation systems, noted for their flexibility and ease of access for customer vendors, marks a significant trend. Academic research nevertheless remains focused largely on Business-to-Consumer (B2C) models, and the few existing C2C recommendation datasets are limited in item attributes, user diversity, and scale. C2C recommendation is further complicated by the dual roles users assume as both sellers and buyers, which makes inputs more varied and less uniform. To address this gap, we introduce MerRec, the first large-scale dataset specifically for C2C recommendations, sourced from the Mercari e-commerce platform and covering millions of users and products over 6 months in 2023. In addition to standard features such as user_id, item_id, and session_id, MerRec includes timestamped action types, product taxonomy, and textual product attributes. We evaluate the dataset extensively across six recommendation tasks, establishing a new benchmark for developing advanced recommendation algorithms in real-world scenarios, helping to bridge the gap between academia and industry, and advancing the study of C2C recommendations.