Exploring the Distributed Knowledge Congruence in Proxy-data-free Federated Distillation

Federated learning (FL) is a privacy-preserving machine learning paradigm in which a server periodically aggregates local model parameters from clients without collecting their private data. Constrained communication and personalization requirements pose severe challenges to FL. Federated distillation (FD) has been proposed to address both problems at once: it exchanges knowledge between the server and clients, which supports heterogeneous local models while significantly reducing communication overhead. However, most existing FD methods require a proxy dataset, which is often unavailable in practice. A few recent proxy-data-free FD approaches eliminate the need for additional public data, but they suffer from substantial discrepancies among local knowledge caused by client-side model heterogeneity, which leads to ambiguous representations on the server and inevitable accuracy degradation. To tackle this issue, we propose FedDKC, a proxy-data-free FD algorithm based on distributed knowledge congruence. FedDKC uses well-designed refinement strategies to narrow the differences among local knowledge to within an acceptable upper bound, thereby mitigating the negative effects of knowledge incongruence. Specifically, from the perspectives of the peak probability and the Shannon entropy of local knowledge, we design kernel-based knowledge refinement (KKR) and searching-based knowledge refinement (SKR), respectively, and theoretically guarantee that the refined local knowledge follows approximately similar distributions and can be regarded as congruent. Extensive experiments on three common datasets demonstrate that FedDKC significantly outperforms state-of-the-art methods in various heterogeneous settings while noticeably improving convergence speed.
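
To make the two quantities named in the abstract concrete, the following is a minimal sketch of how the peak probability and the Shannon entropy of a client's soft prediction could be measured, and of how a simple search could bring the predictions of heterogeneous clients to a common entropy. It is not the KKR or SKR procedure of the paper, whose details are not given here. The use of temperature scaling, the bisection search, and all function names (`softmax`, `peak_probability`, `shannon_entropy`, `search_temperature`, `refine`) are assumptions made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, with an optional temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def peak_probability(probs):
    """Largest class probability of a soft prediction."""
    return max(probs)


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def search_temperature(logits, target_entropy, lo=1e-2, hi=1e2, iters=60):
    """Bisection (in log space) for a temperature whose softmax entropy
    matches target_entropy. Assumption: the target lies between the
    entropies reached at `lo` and `hi`. The search relies on softmax
    entropy being non-decreasing in the temperature."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if shannon_entropy(softmax(logits, mid)) < target_entropy:
            lo = mid  # prediction too sharp: raise the temperature
        else:
            hi = mid  # prediction too flat: lower the temperature
    return math.sqrt(lo * hi)


def refine(logits, target_entropy):
    """Hypothetical refinement: rescale logits so the entropy hits the target."""
    return softmax(logits, search_temperature(logits, target_entropy))


if __name__ == "__main__":
    # Two hypothetical clients predicting the same sample with different models.
    small_model_logits = [2.0, 1.0, 0.1]
    large_model_logits = [8.0, 1.0, -3.0]
    for name, logits in [("small", small_model_logits), ("large", large_model_logits)]:
        raw = softmax(logits)
        refined = refine(logits, target_entropy=0.8)
        print(name, "raw peak:", peak_probability(raw),
              "raw entropy:", shannon_entropy(raw),
              "refined entropy:", shannon_entropy(refined))
```

In this sketch the two clients produce predictions of very different sharpness before refinement, and predictions with matching entropy afterwards, while the predicted class is unchanged because temperature scaling preserves the ordering of the logits. Whether matching entropy alone is sufficient for the server-side representation is exactly the kind of question the paper's theoretical analysis addresses; the sketch makes no claim about that.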
