Cross-platform recommendation aims to improve recommendation accuracy by gathering heterogeneous features from different platforms. However, such cross-silo collaboration is restricted by increasingly stringent privacy regulations, so the platforms' data cannot be aggregated for training. Federated learning (FL) is a practical solution to this data silo problem in recommendation scenarios. Existing cross-silo FL methods transmit model information to collaboratively build a global model from the data of overlapped users. In practice, however, the number of overlapped users is often very small, which severely limits the performance of such approaches. Moreover, transmitting model information during training incurs high communication costs and may cause serious privacy leakage. In this paper, we propose FedPDD, a novel privacy-preserving double distillation framework for cross-silo federated recommendation that transfers knowledge efficiently when overlapped users are limited. Specifically, our double distillation strategy enables each local model to learn not only explicit knowledge from the other party but also implicit knowledge from its own past predictions. To ensure privacy and efficiency, we employ an offline training scheme that reduces both communication needs and the risk of privacy leakage. In addition, we adopt differential privacy to further protect the transmitted information. Experiments on two real-world recommendation datasets, HetRec-MovieLens and Criteo, demonstrate the effectiveness of FedPDD compared with state-of-the-art approaches.
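To make the double distillation idea in the abstract concrete, the following is a minimal sketch and not the FedPDD implementation. It assumes a binary prediction task (such as click-through prediction) and combines three terms on overlapped users: a supervised loss, a distillation term toward the other party's predictions (explicit knowledge), and a distillation term toward the local model's own earlier predictions (implicit knowledge). The function names, the weights `alpha` and `beta`, and the Gaussian perturbation are illustrative assumptions. In particular, the noise step only shows where a privacy mechanism would sit; it does not by itself establish a differential privacy guarantee.

```python
import math
import random


def _clip(p, eps=1e-7):
    """Keep a probability strictly inside (0, 1) so the logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


def binary_cross_entropy(label, p):
    """Supervised loss for one example with a 0/1 label."""
    p = _clip(p)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def bernoulli_kl(p_teacher, p_student):
    """KL(teacher || student) between two Bernoulli distributions."""
    t, s = _clip(p_teacher), _clip(p_student)
    return t * math.log(t / s) + (1.0 - t) * math.log((1.0 - t) / (1.0 - s))


def double_distillation_loss(labels, p_local, p_other, p_past,
                             alpha=0.5, beta=0.5):
    """Mean loss over overlapped users (hypothetical formulation).

    p_other: predictions received from the other party (explicit knowledge).
    p_past:  the local model's own earlier predictions (implicit knowledge).
    alpha, beta: assumed weights for the two distillation terms.
    """
    total = 0.0
    for y, p, q_other, q_past in zip(labels, p_local, p_other, p_past):
        total += binary_cross_entropy(y, p)
        total += alpha * bernoulli_kl(q_other, p)
        total += beta * bernoulli_kl(q_past, p)
    return total / len(labels)


def add_gaussian_noise(predictions, sigma, rng):
    """Perturb predictions before sharing them, then clip to [0, 1].

    Illustrative only: sigma is not calibrated to any privacy budget here.
    """
    return [min(max(p + rng.gauss(0.0, sigma), 0.0), 1.0) for p in predictions]


labels = [1, 0, 1]
p_local = [0.7, 0.2, 0.6]

# When both teachers agree with the student, the distillation terms vanish.
loss_same = double_distillation_loss(labels, p_local, p_local, p_local)

# Disagreeing teachers add a positive penalty on top of the supervised loss.
p_other_noisy = add_gaussian_noise([0.9, 0.1, 0.8], 0.05, random.Random(0))
loss_diff = double_distillation_loss(labels, p_local, p_other_noisy,
                                     [0.6, 0.3, 0.5])
```

Only predictions on overlapped users are exchanged in this sketch, rather than model parameters or gradients, which is the property the offline training scheme relies on to reduce communication.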