As users increasingly interact with large language models (LLMs) using private information, secure and encrypted communication becomes essential. Homomorphic encryption (HE) provides a principled solution by enabling computation directly on encrypted data. Although prior work has explored aspects of running LLMs under HE, the challenge of text generation, particularly next-token prediction, has received limited attention and remains a key obstacle to practical encrypted interaction. In this work, we propose a TSP-based token reordering strategy to address the difficulties of encrypted text generation, together with a post-processing step that further reduces approximation error. Theoretical analysis and experimental results demonstrate that our method prevents collapse, improves coherence in generated text, and preserves data privacy throughout. Overall, our contributions advance the feasibility of practical and privacy-preserving LLM inference.