Large Language Models (LLMs) have shown great success in recommender systems. However, user data is often limited and sparse, which restricts an LLM's ability to model behavior patterns effectively. To address this, existing studies have explored cross-domain solutions in the form of Cross-Domain Recommendation (CDR). However, previous methods typically assume that domains overlap and that their data can be readily accessed. No existing LLM-based method addresses privacy preservation in the CDR setting, that is, Privacy-Preserving Cross-Domain Recommendation (PPCDR). Conducting non-overlapping PPCDR with LLMs is challenging for three reasons: 1) The inability to share user identity or behavioral data across domains impedes effective cross-domain alignment. 2) The heterogeneity of data modalities across domains complicates knowledge integration. 3) Fusing collaborative filtering signals from traditional recommendation models with LLMs is difficult, as the two operate in distinct feature spaces. To address these issues, we propose SF-UBM, a Semantic-enhanced Federated User Behavior Modeling method. Specifically, to deal with Challenge 1, we use natural language as a universal bridge that connects disjoint domains through a semantic-enhanced federated architecture, in which text-based item representations are encrypted and shared while user-specific data remains local. To handle Challenge 2, we design a Fact-counter Knowledge Distillation module that integrates domain-agnostic knowledge with domain-specific knowledge across different data modalities. To tackle Challenge 3, we project pre-learned user preferences and cross-domain item representations into the soft prompt space, aligning the behavioral and semantic spaces for effective LLM learning. We conduct extensive experiments on three pairs of real-world domains, and the results demonstrate the effectiveness of SF-UBM compared with recent state-of-the-art methods.
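The projection step described for Challenge 3 can be illustrated with a minimal sketch. It is not the SF-UBM implementation: the linear projectors, the randomly initialised weights (standing in for trained parameters), all dimensions, and the names `make_projection`, `project`, and `build_soft_prompt` are assumptions made for illustration only. The sketch shows one plausible way to map a pre-learned user preference vector and a few item representations into a shared space with the same width as an LLM's token embeddings, so that they can be prepended to a prompt as soft tokens.

```python
import random


def make_projection(d_in, d_out, seed=0):
    """Build a (d_out x d_in) weight matrix and a zero bias.

    Random values stand in for trained projector weights (an assumption).
    """
    rng = random.Random(seed)
    scale = 1.0 / d_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(d_in)]
               for _ in range(d_out)]
    bias = [0.0] * d_out
    return weights, bias


def project(vec, weights, bias):
    """Apply a linear map: one input embedding -> one soft-prompt token."""
    if any(len(row) != len(vec) for row in weights):
        raise ValueError("input dimension does not match projector")
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def build_soft_prompt(user_vec, item_vecs, user_proj, item_proj):
    """Return soft tokens ordered as [user, item_1, ..., item_n]."""
    tokens = [project(user_vec, *user_proj)]
    tokens += [project(v, *item_proj) for v in item_vecs]
    return tokens


# Hypothetical sizes: 4-d user preferences, 6-d item representations,
# and an 8-d LLM embedding space (real models are far wider).
USER_DIM, ITEM_DIM, LLM_DIM = 4, 6, 8
user_proj = make_projection(USER_DIM, LLM_DIM, seed=1)
item_proj = make_projection(ITEM_DIM, LLM_DIM, seed=2)

user_vec = [0.2, -0.1, 0.5, 0.3]
item_vecs = [[0.1] * ITEM_DIM, [0.4, 0.0, -0.2, 0.3, 0.1, 0.9]]

soft_prompt = build_soft_prompt(user_vec, item_vecs, user_proj, item_proj)
print(len(soft_prompt), len(soft_prompt[0]))
```

In this sketch the user and item projectors are kept separate because the two inputs come from different feature spaces with different widths; only their outputs share the LLM embedding width.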