AI assistants are becoming an integral part of society, and people use them to seek advice or help with personal and confidential issues. In this paper, we unveil a novel side-channel that can be used to read encrypted responses from AI assistants over the web: the token-length side-channel. We found that many vendors, including OpenAI and Microsoft, are affected by this side-channel. However, inferring the content of a response from a token-length sequence alone is challenging. This is because tokens are akin to words, and responses can be several sentences long, so a single length sequence can correspond to millions of grammatically correct sentences. We show how this can be overcome by (1) using a large language model (LLM) to translate these sequences into text, (2) providing the LLM with inter-sentence context to narrow the search space, and (3) performing a known-plaintext attack by fine-tuning the model on the target model's writing style. Using these methods, we were able to accurately reconstruct 29% of an AI assistant's responses and successfully infer the topic of 55% of them. To demonstrate the threat, we performed the attack on OpenAI's ChatGPT-4 and Microsoft's Copilot on both browser and API traffic.
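To make the token-length side-channel concrete, the following is a minimal sketch of what an eavesdropper might compute. It is not the attack pipeline described in this paper. It assumes, hypothetically, that each token of a response is sent in its own encrypted record and that every record carries the same fixed overhead; the function names, the overhead value, and the toy vocabulary are illustrative only.

```python
from typing import Dict, List


def lengths_from_record_sizes(record_sizes: List[int], overhead: int) -> List[int]:
    """Recover token lengths from observed ciphertext record sizes.

    Assumption (hypothetical): one token per record and a constant
    per-record overhead, so size - overhead equals the token length.
    """
    lengths = []
    for size in record_sizes:
        if size < overhead:
            raise ValueError(f"record of {size} bytes is smaller than the overhead")
        lengths.append(size - overhead)
    return lengths


def candidates_per_position(lengths: List[int], vocabulary: List[str]) -> List[List[str]]:
    """For each observed length, list the vocabulary tokens of that length."""
    by_length: Dict[int, List[str]] = {}
    for token in vocabulary:
        by_length.setdefault(len(token), []).append(token)
    return [by_length.get(n, []) for n in lengths]


if __name__ == "__main__":
    # Toy response, tokenized by hand; real tokenizers differ.
    tokens = ["I", " can", " help", " with", " that"]
    overhead = 40  # illustrative value, not measured from any vendor
    observed = [len(t) + overhead for t in tokens]

    lengths = lengths_from_record_sizes(observed, overhead)
    print(lengths)

    toy_vocab = ["I", "a", " can", " the", " help", " with", " that", " this"]
    for n, options in zip(lengths, candidates_per_position(lengths, toy_vocab)):
        print(n, options)
```

Even in this toy vocabulary, several tokens share each length, so the number of candidate sentences grows multiplicatively with the length of the response. This is the ambiguity that the LLM-based translation, inter-sentence context, and fine-tuning steps are meant to resolve.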