Cross-domain recommender (CDR) systems aim to transfer knowledge from data-rich domains to data-sparse ones, alleviating the sparsity and cold-start issues of conventional single-domain recommenders. However, many CDR approaches rely on overlapping users or items to establish explicit cross-domain connections, an assumption that is often unrealistic in practice. Moreover, most methods represent user preferences as fixed discrete vectors, which limits their ability to capture the fine-grained, multi-aspect nature of user interests. To address these limitations, we propose DUP-OT (Distributional User Preferences with Optimal Transport), a novel framework for non-overlapping CDR. DUP-OT consists of three stages: (1) a shared preprocessing module that extracts review-based embeddings using a unified sentence encoder and autoencoder; (2) a user preference modeling module that represents each user's interests as a Gaussian Mixture Model (GMM) over item embeddings; and (3) an optimal-transport-based alignment module that matches Gaussian components across domains, enabling effective preference transfer for target-domain rating prediction. Experiments on Amazon Review datasets show that DUP-OT outperforms single-domain baselines even when it uses no source-domain data, and that it achieves lower RMSE than the cross-domain baseline TDAR under strictly non-overlapping training settings, demonstrating its effectiveness in reducing large prediction errors for cold-start users. The implementation is available at https://github.com/XiaoZY2000/dup-ot.
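To make stage (3) more concrete, the following is a minimal sketch of how Gaussian components from two domains might be matched with entropy-regularized optimal transport. It is not taken from the DUP-OT repository. The diagonal covariances, the closed-form squared 2-Wasserstein cost between components, the Sinkhorn solver, the regularization value, and all names (`gaussian_w2_sq`, `sinkhorn`, `src_mu`, `tgt_mu`, and so on) are assumptions chosen for illustration, and the toy parameters are random rather than fitted to any dataset.

```python
import numpy as np

def gaussian_w2_sq(mu_a, var_a, mu_b, var_b):
    """Squared 2-Wasserstein distance between two diagonal-covariance Gaussians."""
    mean_term = np.sum((mu_a - mu_b) ** 2)
    cov_term = np.sum((np.sqrt(var_a) - np.sqrt(var_b)) ** 2)
    return mean_term + cov_term

def sinkhorn(a, b, cost, reg=0.5, n_iters=500):
    """Entropy-regularized transport plan between weight vectors a and b."""
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Toy GMM parameters (hypothetical): 3 source and 2 target components in 4-D.
rng = np.random.default_rng(0)
src_mu = rng.normal(size=(3, 4))
src_var = rng.uniform(0.5, 1.5, size=(3, 4))
src_w = np.array([0.5, 0.3, 0.2])
tgt_mu = rng.normal(size=(2, 4))
tgt_var = rng.uniform(0.5, 1.5, size=(2, 4))
tgt_w = np.array([0.6, 0.4])

# Pairwise component-to-component cost, scaled to [0, 1] for numerical stability.
cost = np.array([
    [gaussian_w2_sq(src_mu[i], src_var[i], tgt_mu[j], tgt_var[j])
     for j in range(len(tgt_w))]
    for i in range(len(src_w))
])
cost = cost / cost.max()

plan = sinkhorn(src_w, tgt_w, cost)

# Barycentric projection: where each source component's mean lands in the target domain.
mapped_mu = (plan / src_w[:, None]) @ tgt_mu
```

Each entry `plan[i, j]` can be read as the share of mixture weight that source component `i` sends to target component `j`. The barycentric projection at the end is one simple way to carry a source-domain preference into the target embedding space; the paper's actual transfer mechanism may differ.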