Textual data is often represented as real-numbered embeddings in NLP, particularly with the popularity of large language models (LLMs) and Embeddings as a Service (EaaS). However, sensitive information stored as embeddings can be vulnerable to security breaches: research shows that text can be reconstructed from embeddings, even without knowledge of the underlying model. While defense mechanisms have been explored, they focus exclusively on English, leaving other languages vulnerable to attacks. This work explores LLM security through multilingual embedding inversion. We define the problem of black-box multilingual and cross-lingual inversion attacks, and thoroughly explore their potential implications. Our findings suggest that multilingual LLMs may be more vulnerable to inversion attacks, in part because English-based defenses may be ineffective. To alleviate this, we propose a simple masking defense that is effective for both monolingual and multilingual models. This study is the first to investigate multilingual inversion attacks, shedding light on the differences in attacks and defenses across monolingual and multilingual settings.