Federated learning (FL) is an emerging approach for collaboratively training the semantic encoder/decoder models of semantic communication systems without private user data leaving the devices. Most existing studies on trustworthy FL aim to eliminate data poisoning threats posed by malicious clients, but in many cases defending against model poisoning attacks launched by fake servers is an equally important objective. This paper proposes a certificateless authentication-based trustworthy federated learning (CATFL) framework that mutually authenticates the identities of the clients and the server. In CATFL, each client verifies the server's signature before accepting the delivered global model, ensuring that the global model was not delivered by a fake server. Conversely, the server verifies each client's signature before accepting the submitted model updates, ensuring that they come from authorized clients. Compared with PKI-based methods, CATFL avoids heavy certificate management overhead. Meanwhile, full client anonymity can hide the source of data poisoning attacks, whereas real-name registration exposes users to user-specific privacy leakage. CATFL therefore also includes a pseudonym generation strategy that trades off identity traceability against user anonymity, which is essential for conditionally preventing user-specific privacy leakage. Theoretical security analysis and evaluation results validate the superiority of CATFL.
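The mutual verification and traceable pseudonyms described above can be illustrated with a minimal Python sketch. It is not the CATFL construction: HMAC-SHA256 with a pre-shared key stands in for the certificateless signature scheme, and every name here (`sign`, `verify`, `Registry`, `client_accepts_global_model`, `server_accepts_update`) is hypothetical, chosen only to show where each check sits in a training round.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, payload: bytes) -> bytes:
    # Stand-in for a certificateless signature: HMAC-SHA256 with a shared key.
    # Assumption for illustration only; the real scheme is asymmetric.
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    # Constant-time comparison of the recomputed tag with the received one.
    return hmac.compare_digest(sign(key, payload), tag)


class Registry:
    """Hypothetical trusted authority that issues and traces pseudonyms."""

    def __init__(self) -> None:
        self._master = secrets.token_bytes(32)
        self._trace = {}  # pseudonym -> real identity, known only to the registry

    def issue_pseudonym(self, real_id: str) -> str:
        # A fresh nonce makes pseudonyms of the same user unlinkable to outsiders.
        nonce = secrets.token_bytes(16)
        digest = hmac.new(self._master, real_id.encode() + nonce, hashlib.sha256)
        pid = digest.hexdigest()[:16]
        self._trace[pid] = real_id
        return pid

    def trace(self, pid: str):
        # Conditional traceability: only the registry can map a pseudonym back.
        return self._trace.get(pid)


def client_accepts_global_model(server_key: bytes, model: bytes, tag: bytes) -> bool:
    # Client side: reject any global model not signed by the genuine server.
    return verify(server_key, model, tag)


def server_accepts_update(client_keys: dict, pid: str, update: bytes, tag: bytes) -> bool:
    # Server side: reject updates from unknown pseudonyms or with bad signatures.
    key = client_keys.get(pid)
    if key is None:
        return False
    return verify(key, pid.encode() + update, tag)
```

In this sketch the server only ever sees the pseudonym, while the registry can still recover the real identity if an update is later found to be poisoned, which is the trade-off between anonymity and traceability that the abstract refers to.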