Federated learning works by aggregating locally computed gradients from multiple clients, thus enabling collaborative training without sharing private client data. However, prior work has shown that the data can actually be recovered by the server using so-called gradient inversion attacks. While these attacks perform well when applied to images, they are limited in the text domain and only permit approximate reconstruction of small batches and short input sequences. In this work, we propose DAGER, the first algorithm to recover whole batches of input text exactly. DAGER leverages the low-rank structure of self-attention layer gradients and the discrete nature of token embeddings to efficiently check if a given token sequence is part of the client data. We use this check to exactly recover full batches in the honest-but-curious setting without any prior on the data for both encoder- and decoder-based architectures, using exhaustive heuristic search and a greedy approach, respectively. We provide an efficient GPU implementation of DAGER and show experimentally that it recovers full batches of size up to 128 on large language models (LLMs), beating prior attacks in speed (20x at the same batch size), scalability (10x larger batches), and reconstruction quality (ROUGE-1/2 > 0.99).
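The membership check at the heart of this idea can be illustrated with a toy sketch (all dimensions, variable names, and the random data below are illustrative assumptions, not the paper's actual setup). For a linear layer y = xW, the observed gradient ∂L/∂W = xᵀ(∂L/∂y) has a column space spanned by the rows of x, i.e. by the input token embeddings. Because the embedding table is discrete, a candidate token can be tested for membership in the client batch by projecting its embedding onto that low-rank span and checking the residual:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): an embedding table, a linear layer input x made
# of a private "client batch" of tokens, and an upstream gradient dL/dy.
d_in, d_out, vocab, n = 32, 16, 100, 4
emb = rng.normal(size=(vocab, d_in))   # token embedding table (vocab, d_in)
client_tokens = [3, 17, 42, 7]         # private tokens held by the client
x = emb[client_tokens]                 # layer input, shape (n, d_in)
g_out = rng.normal(size=(n, d_out))    # upstream gradient dL/dy, (n, d_out)

# The server observes the weight gradient of the layer y = x @ W:
#   dL/dW = x.T @ g_out, shape (d_in, d_out), rank <= n.
# Its column space is spanned by the rows of x (the client embeddings).
grad_W = x.T @ g_out

# Orthonormal basis of the gradient's column space via SVD,
# keeping only singular directions above a numerical threshold.
U, S, _ = np.linalg.svd(grad_W, full_matrices=False)
rank = int((S > 1e-10 * S[0]).sum())
Q = U[:, :rank]                        # basis of the low-rank span

def in_span(e, tol=1e-6):
    """Check whether embedding e lies (numerically) in span(Q)."""
    residual = e - Q @ (Q.T @ e)       # component orthogonal to the span
    return np.linalg.norm(residual) < tol * np.linalg.norm(e)

# Exhaustively test every vocabulary token against the span.
recovered = [t for t in range(vocab) if in_span(emb[t])]
print(recovered)  # → [3, 7, 17, 42]
```

In this toy Gaussian setting only the four client embeddings lie in the gradient's column space, so the exhaustive scan recovers exactly the private tokens (in vocabulary order). The actual attack applies this kind of check to self-attention layer gradients and combines it with search to reconstruct full ordered sequences.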