Contrastive Pre-training for Deep Session Data Understanding

Session data has been widely used for understanding user behavior in e-commerce. Researchers leverage session data for tasks such as purchase intention prediction, remaining length prediction, and recommendation, because it provides contextual clues about a user's dynamic interests. However, online shopping session data is semi-structured and complex: it contains both unstructured textual data, such as product descriptions and search queries, and structured user action sequences. Most existing works leverage only the coarse-grained item sequences for specific tasks and largely ignore the fine-grained information in the text and in the details of user actions. In this work, we pursue deep session data understanding by examining the various clues within the rich information in user sessions. Specifically, we propose to pre-train a general-purpose User Behavior Model (UBM) over large-scale session data with rich details, such as product titles, attributes, and various kinds of user actions. We introduce a two-stage pre-training scheme that encourages the model to learn in a self-supervised manner from various augmentations, using contrastive learning objectives that span different granularity levels of session data. The pre-trained session understanding model can then be easily fine-tuned for various downstream tasks. Extensive experiments show that UBM better captures the complex intra-item semantic relations, inter-item connections, and inter-interaction dependencies, leading to large performance gains over the baselines on several downstream tasks. UBM also demonstrates strong robustness when data is sparse.
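To make the idea of a contrastive learning objective over augmentations concrete, the following is a minimal sketch of a generic InfoNCE-style loss. It is an illustration only, not the objective used for UBM: the function names `cosine` and `info_nce`, the `temperature` parameter, the use of cosine similarity, and the use of in-batch negatives are all assumptions introduced here. Each session is assumed to have two augmented views that have already been encoded into fixed-size embedding vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper, for illustration).

    view_a[i] and view_b[i] are embeddings of two augmentations of the same
    session (the positive pair). Every view_b[j] with j != i is treated as an
    in-batch negative for view_a[i].
    """
    losses = []
    for i, anchor in enumerate(view_a):
        # Temperature-scaled similarity of the anchor to every candidate.
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability assigned to the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    sessions = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]      # each view matches its own session
    mismatched = [[0.0, 1.0], [1.0, 0.0]]   # views swapped between sessions
    print(info_nce(sessions, aligned) < info_nce(sessions, mismatched))
```

The loss is small when the two views of the same session are closer to each other than to views of other sessions, which is the behavior a contrastive pre-training objective rewards. A multi-granularity scheme could in principle apply a loss of this shape to embeddings produced at different levels of the session, but how the levels are defined and combined is not specified here.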
