Federated learning works by aggregating locally computed gradients from multiple clients, thus enabling collaborative training without sharing private client data. However, prior work has shown that the server can recover this data using so-called gradient inversion attacks. While these attacks perform well on images, they are limited in the text domain, where they permit only approximate reconstruction of small batches and short input sequences. In this work, we propose DAGER, the first algorithm to recover whole batches of input text exactly. DAGER leverages the low-rank structure of self-attention layer gradients and the discrete nature of token embeddings to efficiently check whether a given token sequence is part of the client data. We use this check to exactly recover full batches in the honest-but-curious setting, without any prior on the data, for both encoder- and decoder-based architectures, using exhaustive heuristic search and a greedy approach, respectively. We provide an efficient GPU implementation of DAGER and show experimentally that it recovers full batches of size up to 128 on large language models (LLMs), outperforming prior attacks in speed (20x at the same batch size), scalability (10x larger batches), and reconstruction quality (ROUGE-1/2 > 0.99).
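To illustrate the kind of membership check described above, the following is a minimal sketch and not the DAGER implementation. It assumes a single linear projection Y = XW whose input rows X are raw token embeddings, so that the weight gradient X^T (dL/dY) has rank at most the number of client tokens and its column space lies in the span of those embeddings. Positional embeddings, normalization, and deeper layers are ignored. The names `column_basis`, `span_residual`, and `recover_token_set`, the toy dimensions, and the thresholds are all hypothetical.

```python
import numpy as np


def column_basis(grad_w, tol=1e-8):
    """Orthonormal basis for the column space of a (numerically) low-rank gradient."""
    u, s, _ = np.linalg.svd(grad_w, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))  # count singular values above a relative cutoff
    return u[:, :rank]


def span_residual(basis, vec):
    """Relative distance from vec to the subspace spanned by the columns of basis."""
    projection = basis @ (basis.T @ vec)
    return float(np.linalg.norm(vec - projection) / np.linalg.norm(vec))


def recover_token_set(grad_w, embeddings, threshold=1e-6):
    """Return vocabulary ids whose embedding lies (numerically) in the gradient's span."""
    basis = column_basis(grad_w)
    return [i for i, e in enumerate(embeddings) if span_residual(basis, e) < threshold]


# Toy setup (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
vocab_size, d_model, d_out, n_tokens = 50, 32, 32, 5

embeddings = rng.normal(size=(vocab_size, d_model))          # discrete token embeddings
client_ids = sorted(rng.choice(vocab_size, size=n_tokens, replace=False).tolist())

x = embeddings[client_ids]                                   # client inputs, (n_tokens, d_model)
upstream = rng.normal(size=(n_tokens, d_out))                # stand-in for dL/dY
grad_w = x.T @ upstream                                      # gradient w.r.t. W, rank <= n_tokens

recovered = recover_token_set(grad_w, embeddings)
print("client tokens:   ", client_ids)
print("recovered tokens:", recovered)
```

Because the vocabulary is finite, the server in this toy setting can test every candidate embedding against the span. The check succeeds only while the number of client tokens stays below the embedding dimension, which is why the low-rank structure matters.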