Federated learning is a framework for collaborative machine learning in which clients share only gradient updates with a server, not their private data. However, it was recently shown that gradient inversion attacks can reconstruct this data from the shared gradients. In the important honest-but-curious setting, existing attacks enable exact reconstruction only for a batch size of $b=1$, with larger batches permitting only approximate reconstruction. In this work, we propose SPEAR, the first algorithm reconstructing whole batches with $b>1$ exactly. SPEAR combines insights into the explicit low-rank structure of gradients with a sampling-based algorithm. Crucially, we leverage ReLU-induced gradient sparsity to precisely filter out large numbers of incorrect samples, making a final reconstruction step tractable. We provide an efficient GPU implementation for fully connected networks and show that it recovers high-dimensional ImageNet inputs in batches of $b \lesssim 25$ exactly while scaling to large networks. Finally, we show theoretically that much larger batches can be reconstructed with high probability given exponential time.
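The low-rank observation underlying the attack can be illustrated with a minimal NumPy sketch (an assumption-laden illustration, not the SPEAR algorithm itself): the weight gradient of a fully connected layer is a sum of $b$ rank-one outer products of per-sample upstream gradients and inputs, so its rank is at most the batch size $b$. All dimensions and the random data below are hypothetical.

```python
import numpy as np

# Illustrative sketch only: shows that dL/dW of a fully connected layer
# has rank at most the batch size b. Dimensions are arbitrary assumptions.
rng = np.random.default_rng(0)
b, d_in, d_out = 4, 64, 128            # batch size, input dim, output dim

X = rng.standard_normal((b, d_in))     # private inputs (one row per sample)
G = rng.standard_normal((b, d_out))    # upstream gradients dL/dZ per sample

# Gradient w.r.t. the weight matrix W (shape d_out x d_in):
# dL/dW = sum_i outer(G[i], X[i]) = G^T X  -- a sum of b rank-1 terms.
dW = G.T @ X

rank = np.linalg.matrix_rank(dW)
print(rank)                            # at most b; here 4 for generic data
```

For generic (continuously distributed) inputs the rank equals $b$ exactly, which is what lets an attacker read the batch size off the shared gradient and motivates recovering the individual rank-one components.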