Expressive variational quantum circuits provide inherent privacy in federated learning

Federated learning has emerged as a viable distributed approach to training machine learning models without sharing data with the central aggregator. However, standard neural network-based federated learning models have been shown to be susceptible to data leakage from the gradients shared with the server. In this work, we introduce federated learning with a variational quantum circuit model built using expressive encoding maps coupled with overparameterized ansätze. We show that expressive maps lead to inherent privacy against gradient inversion attacks, while overparameterization ensures model trainability. Our privacy framework centers on the complexity of solving the system of high-degree multivariate Chebyshev polynomials generated by the gradients of the quantum circuit. We present arguments highlighting the inherent difficulty of solving these equations, in both exact and approximate scenarios. Additionally, we examine machine learning-based attack strategies and establish a direct connection between overparameterization in the original federated learning model and underparameterization in the attack model. Furthermore, we provide numerical scaling arguments showing that underparameterization of the expressive map in the attack model leads to a loss landscape swamped with exponentially many spurious local minima, making a successful attack extremely hard to realize. This provides, for the first time, a strong claim that the nature of quantum machine learning models inherently helps prevent data leakage in federated learning.
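
As a toy illustration of why gradients of an expressive encoding can be hard to invert, the sketch below uses a single simulated qubit rather than the model studied in this work. It assumes an encoding rotation RY(k·arccos(x)) followed by one trainable rotation RY(θ) and a Z measurement; the function names, the integer k, and the choice θ = π/2 are all hypothetical. Under these assumptions the shared gradient equals −T_k(x), a degree-k Chebyshev polynomial in the private input x, so a single gradient value is consistent with up to k distinct inputs.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

Z = np.diag([1.0, -1.0])  # Pauli-Z observable


def ry(angle):
    """Single-qubit RY rotation matrix (real-valued)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def expectation(x, theta, k):
    """<Z> after the assumed encoding RY(k*arccos(x)) and trainable RY(theta)."""
    state = ry(theta) @ ry(k * np.arccos(x)) @ np.array([1.0, 0.0])
    return float(state @ Z @ state)


def gradient(x, theta, k):
    """d<Z>/d(theta) via the parameter-shift rule."""
    plus = expectation(x, theta + np.pi / 2, k)
    minus = expectation(x, theta - np.pi / 2, k)
    return 0.5 * (plus - minus)


def candidate_inputs(g, k):
    """All x in [-1, 1] consistent with gradient g at theta = pi/2.

    At theta = pi/2 the gradient is -T_k(x), so we solve T_k(x) + g = 0.
    """
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0   # coefficient of T_k
    coeffs[0] += g    # constant term (T_0 = 1)
    roots = C.chebroots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real[(real >= -1.0) & (real <= 1.0)])


if __name__ == "__main__":
    k, theta, x_private = 5, np.pi / 2, 0.2
    g = gradient(x_private, theta, k)
    print("shared gradient:", g)
    print("inputs consistent with it:", candidate_inputs(g, k))
```

In this one-dimensional toy case the ambiguity grows only linearly with k. The hardness argued for in the abstract concerns systems of multivariate polynomials, which this sketch does not attempt to reproduce.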
