Cross-platform recommendation aims to improve recommendation accuracy by gathering heterogeneous features from different platforms. However, such cross-silo collaboration between platforms is restricted by increasingly stringent privacy protection regulations, so data cannot be aggregated for training. Federated learning (FL) is a practical solution to the data silo problem in recommendation scenarios. Existing cross-silo FL methods transmit model information to collaboratively build a global model by leveraging the data of overlapped users. In reality, however, the number of overlapped users is often very small, which severely limits the performance of such approaches. Moreover, transmitting model information during training incurs high communication costs and may cause serious privacy leakage. In this paper, we propose FedPDD, a novel privacy-preserving double distillation framework for cross-silo federated recommendation that efficiently transfers knowledge when overlapped users are limited. Specifically, our double distillation strategy enables each local model to learn not only explicit knowledge from the other party but also implicit knowledge from its own past predictions. To ensure privacy and high efficiency, we employ an offline training scheme that reduces both communication needs and the risk of privacy leakage. In addition, we adopt differential privacy to further protect the transmitted information. Experiments on two real-world recommendation datasets, HetRec-MovieLens and Criteo, demonstrate the effectiveness of FedPDD compared with state-of-the-art approaches.
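To make the double distillation idea concrete, the following is a minimal sketch of how a local training objective could combine the two knowledge sources on overlapped users. The abstract does not give the loss, so everything here is an assumption for illustration: the function names, the weights `alpha` and `beta`, the use of binary cross-entropy against soft targets, and the Laplace noise standing in for the differential privacy mechanism are hypothetical and not taken from FedPDD itself.

```python
import math
import random


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy of a prediction against a hard or soft target."""
    pred = min(max(pred, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def double_distillation_loss(labels, preds, peer_preds, past_preds,
                             alpha=0.5, beta=0.5):
    """Mean per-user loss with three terms (all weights are assumptions):
    - supervised loss against the true label,
    - explicit distillation toward the other party's predictions,
    - implicit distillation toward the model's own past predictions.
    """
    total = 0.0
    for y, p, q_peer, q_past in zip(labels, preds, peer_preds, past_preds):
        total += bce(y, p) + alpha * bce(q_peer, p) + beta * bce(q_past, p)
    return total / len(labels)


def privatize(preds, scale, rng):
    """Add Laplace(0, scale) noise to predictions before sharing them,
    then clip back to [0, 1]. The choice of mechanism is an assumption."""
    noisy = []
    for p in preds:
        # The difference of two exponentials with mean `scale` is Laplace(0, scale).
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(min(max(p + noise, 0.0), 1.0))
    return noisy


# Example: the other party shares noised predictions for three overlapped users.
rng = random.Random(0)
shared = privatize([0.8, 0.2, 0.6], scale=0.05, rng=rng)
loss = double_distillation_loss(
    labels=[1, 0, 1],
    preds=[0.7, 0.3, 0.5],
    peer_preds=shared,
    past_preds=[0.6, 0.4, 0.5],
)
```

In this sketch only predictions on overlapped users cross the platform boundary, which is one way to read the claim that offline training reduces communication compared with exchanging model information during training.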