Pre-training of large models is increasingly prevalent as user-generated content keeps growing across many categories of machine learning applications. Learning contextual knowledge from datasets that record user-content interactions is recognized as vital for downstream tasks. Although several studies have attempted to learn contextual knowledge through pre-training, finding an optimal training objective and strategy for this type of task remains challenging. In this work, we contend that, for datasets where user-content interaction can be represented as a bipartite graph, contextual knowledge has two distinct aspects: the user side and the content side. To learn this contextual knowledge, we propose a pre-training method that learns a bi-directional mapping between the user-side and content-side spaces. We formulate the training goal as a contrastive learning task and propose a dual-Transformer architecture to encode the contextual knowledge. We evaluate the proposed method on the recommendation task. Empirical studies demonstrate that the proposed method outperforms all baselines with significant gains.
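The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common way a bi-directional contrastive goal between two embedding spaces can be written: a symmetric InfoNCE loss with in-batch negatives. Everything in it is an assumption made for illustration, not the authors' formulation: the function names (`symmetric_info_nce`, `l2_normalize`, `log_softmax`), the cosine similarity, the `temperature` value, and the use of plain Python lists as stand-ins for the outputs of the two Transformer encoders.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def log_softmax(row):
    """Numerically stable log-softmax over one row of similarity scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def symmetric_info_nce(user_emb, content_emb, temperature=0.1):
    """Hypothetical symmetric InfoNCE loss.

    Assumption: user_emb[i] and content_emb[i] are embeddings of a matching
    (positive) user-side / content-side pair, and every other row in the
    batch serves as a negative.
    """
    assert len(user_emb) == len(content_emb) and len(user_emb) > 0
    u = [l2_normalize(v) for v in user_emb]
    c = [l2_normalize(v) for v in content_emb]
    n = len(u)

    # sim[i][j]: temperature-scaled cosine similarity of user i and content j.
    sim = [
        [sum(a * b for a, b in zip(u[i], c[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # User -> content direction: each row should pick out its own column.
    user_to_content = -sum(log_softmax(sim[i])[i] for i in range(n)) / n

    # Content -> user direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    content_to_user = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n

    # Averaging both directions makes the objective bi-directional.
    return 0.5 * (user_to_content + content_to_user)


if __name__ == "__main__":
    users = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    shuffled = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned loss: ", symmetric_info_nce(users, aligned))
    print("shuffled loss:", symmetric_info_nce(users, shuffled))
```

Under these assumptions, the loss is small when each user-side embedding is closest to its own content-side embedding and large when the pairs are mismatched. In a real pre-training setup the two inputs would come from trainable encoders and the loss would be minimized by gradient descent, which this sketch deliberately leaves out.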