Cross-Domain Recommendation (CDR) seeks to enable effective knowledge transfer across domains. Existing works rely on either representation alignment or transformation bridges, but they struggle to distinguish domain-shared from domain-specific latent factors. Specifically, while CDR describes user representations as a joint distribution over two domains, these methods fail to account for its joint identifiability because they focus primarily on the marginal distribution within a particular domain. As a result, they may overlook the conditionality between the two domains and how it contributes to latent factor disentanglement, leading to negative transfer when domains are weakly correlated. In this study, we explore what should and should not be transferred in cross-domain user representations from a causality perspective. We propose a Hierarchical subspace disentanglement approach to explore the Joint IDentifiability of the cross-domain joint distribution, termed HJID, which preserves domain-specific behaviors by separating them from domain-shared factors. HJID organizes user representations into layers: generic shallow subspaces and domain-oriented deep subspaces. We first encode the generic pattern in the shallow subspace by minimizing the Maximum Mean Discrepancy (MMD) of the initial-layer activations across domains. Then, to dissect how domain-oriented latent factors are encoded in deeper-layer activations, we construct a cross-domain causality-based data generation graph, which identifies cross-domain consistent and domain-specific components, adhering to the Minimal Change principle. This allows HJID to remain stable while discovering factors unique to each domain, all within a generative framework of invertible transformations that guarantee joint identifiability. Experiments on real-world datasets show that HJID outperforms state-of-the-art (SOTA) methods on a range of strongly and weakly correlated CDR tasks.
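
To make the shallow-subspace objective concrete, the following is a minimal sketch of a Maximum Mean Discrepancy estimate between two sets of activations. It is an illustration under stated assumptions, not the HJID implementation: the abstract does not specify the kernel or the estimator, so the RBF kernel, the biased (V-statistic) estimator, the `bandwidth` parameter, and the synthetic Gaussian "activations" are all hypothetical choices.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel between two vectors (the kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


def sample_activations(rng, n, dim, shift):
    """Hypothetical stand-in for initial-layer activations of one domain."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
acts_a = sample_activations(rng, 100, 4, 0.0)        # domain A
acts_b_close = sample_activations(rng, 100, 4, 0.0)  # domain B, same distribution
acts_b_far = sample_activations(rng, 100, 4, 2.0)    # domain B, shifted distribution

mmd_close = mmd_squared(acts_a, acts_b_close)
mmd_far = mmd_squared(acts_a, acts_b_far)
```

In this sketch, `mmd_far` exceeds `mmd_close` because the shifted sample differs in distribution from `acts_a`. Using such a quantity as a training loss on shallow-layer activations would push the two domains toward a common generic representation, which is the role the abstract assigns to the MMD term.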