Recent research shows that large language models are susceptible to privacy attacks that infer aspects of the training data. However, it is unclear whether simpler generative models, such as topic models, share similar vulnerabilities. In this work, we propose an attack against topic models that can confidently identify members of the training data of a Latent Dirichlet Allocation (LDA) model. Our results suggest that the privacy risks associated with generative modeling are not restricted to large neural models. To mitigate these vulnerabilities, we also explore differentially private (DP) topic modeling. We propose a framework for private topic modeling that incorporates DP vocabulary selection as a pre-processing step, and show that it improves privacy while having a limited effect on practical utility.
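To make the idea of DP vocabulary selection as a pre-processing step more concrete, the following is a minimal sketch and not the framework proposed in this work. It assumes a standard noisy-threshold approach: each document contributes to at most `max_words_per_doc` distinct words, Laplace noise is added to the per-word document counts, and only words whose noisy count exceeds a threshold are kept. The function names, parameters, and the choice of Laplace noise are illustrative assumptions. Calibrating `threshold` to obtain a formal (ε, δ) guarantee is deliberately left out.

```python
import random
from collections import Counter


def dp_vocabulary(docs, epsilon, threshold, max_words_per_doc, seed=None):
    """Select a vocabulary from noisy document-frequency counts.

    Hypothetical helper for illustration only. `docs` is a list of
    tokenized documents (lists of strings).
    """
    rng = random.Random(seed)
    counts = Counter()
    for doc in docs:
        distinct = sorted(set(doc))
        # Bound each document's contribution so that adding or removing
        # one document changes the count vector by at most
        # max_words_per_doc in L1 norm.
        if len(distinct) > max_words_per_doc:
            distinct = rng.sample(distinct, max_words_per_doc)
        counts.update(distinct)

    # Laplace scale = L1 sensitivity / epsilon.
    scale = max_words_per_doc / epsilon
    vocab = set()
    for word, count in counts.items():
        # The difference of two i.i.d. exponentials with mean `scale`
        # is a Laplace(0, scale) sample.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        # Assumption: `threshold` is supplied by the caller; this sketch
        # does not derive it from a target delta.
        if count + noise > threshold:
            vocab.add(word)
    return vocab


def filter_corpus(docs, vocab):
    """Drop every token outside the selected vocabulary."""
    return [[w for w in doc if w in vocab] for doc in docs]
```

In a pipeline of this shape, the filtered corpus would then be passed to whichever private topic modeling algorithm is used, so that the released topic-word distribution only ranges over words that survived the private selection step.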