Many NLP and recommendation models begin by mapping discrete client inputs to embedding vectors. Because these inputs can reveal sensitive information, the embedding step must be protected in privacy-preserving inference. Fully Homomorphic Encryption (FHE) enables inference over encrypted client data, but it turns embedding lookup from a simple table access into a homomorphic computation. To keep the embedding table on the server and avoid transmitting encrypted embedding vectors from the client, we focus on server-side lookup, in which the client sends only a small encrypted index. Prior ICML 2024 work first builds a one-hot vector from the encrypted index and then multiplies it with the embedding table; this one-hot generation is the dominant cost. One-hot-based methods are expensive in FHE because they construct a $p$-dimensional selection vector via one equality test per coordinate, requiring $O(p \log p)$ homomorphic operations in total. Our key observation is that private embedding lookup requires only a linearly independent representation of the encrypted index, not the one-hot basis itself. Building on this observation, we propose Independent Vector Evaluation (IVE). Instead of constructing a one-hot vector, IVE evaluates a linearly independent vector built from successive powers of a single encrypted value, reducing the vector-generation cost to $O(p)$. It then recovers the same embedding vector via a precomputed change of basis, instantiated with an orthogonal Discrete Cosine Transform to mitigate error amplification. In our implementation, IVE improves amortized lookup time by up to 78.4x over the prior method. We further evaluate its impact on end-to-end encrypted FastText inference, where embedding lookup is a major cost because the model is shallow. On the Enron-Spam dataset, replacing one-hot generation with IVE reduces the share of vector generation in encrypted inference time from 99.6% to 66.3%.
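The core idea, replacing the one-hot vector with a power vector and folding a precomputed change of basis into the table, can be illustrated in plaintext. The following is a minimal sketch under stated assumptions rather than the authors' construction: it runs on unencrypted NumPy arrays, it encodes index `idx` as a $p$-th root of unity so that the power vectors form a (unitary, up to scaling) DFT basis in place of the DCT instantiation described above, and all function names (`encode_index`, `power_vector`, `precompute_lookup_matrix`, `lookup`) are hypothetical.

```python
import numpy as np


def encode_index(idx, p):
    """Hypothetical client-side encoding: map index idx to a p-th root of unity."""
    return np.exp(2j * np.pi * idx / p)


def power_vector(x, p):
    """Build (1, x, x^2, ..., x^(p-1)) with p - 1 successive multiplications.

    In an FHE setting each multiplication would be homomorphic; this
    plaintext version ignores multiplicative depth and noise growth.
    """
    v = np.empty(p, dtype=complex)
    v[0] = 1.0
    for k in range(1, p):
        v[k] = v[k - 1] * x
    return v


def precompute_lookup_matrix(table):
    """Fold the change of basis into the table (done once, in the clear).

    table has shape (p, d), one embedding per row. V[k, j] = x_j^k holds the
    power vectors of all p encoded indices as columns, so V^{-1} maps a power
    vector back to the matching one-hot vector. For roots of unity,
    V^{-1} = conj(V) / p, which keeps the transform well conditioned.
    """
    p = table.shape[0]
    k = np.arange(p)
    V = np.exp(2j * np.pi * np.outer(k, k) / p)
    return table.T @ V.conj() / p  # shape (d, p)


def lookup(M, x):
    """Recover the embedding as M @ (1, x, ..., x^(p-1)); no one-hot vector is built."""
    p = M.shape[1]
    return (M @ power_vector(x, p)).real


# Toy table: p = 8 entries of dimension d = 4 (illustrative values only).
rng = np.random.default_rng(0)
p, d = 8, 4
table = rng.standard_normal((p, d))
M = precompute_lookup_matrix(table)

# Every index should recover its own row of the table.
recovered = np.stack([lookup(M, encode_index(idx, p)) for idx in range(p)])
print(np.allclose(recovered, table))
```

The sketch only demonstrates why any linearly independent family of vectors suffices: the server-side work per query is the $p - 1$ multiplications in `power_vector` plus one matrix-vector product, while the basis inversion is paid once offline. It says nothing about ciphertext packing, multiplicative depth, or the error behavior that motivates the orthogonal transform in the encrypted setting.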