Recently, powerful Large Language Models (LLMs) have become easily accessible to hundreds of millions of users worldwide. However, their strong capabilities and vast world knowledge do not come without associated privacy risks. In this work, we focus on the emerging privacy threat LLMs pose -- the ability to accurately infer personal information from online texts. Despite the growing importance of LLM-based author profiling, research in this area has been hampered by a lack of suitable public datasets, largely due to the ethical and privacy concerns associated with real personal data. We take two steps to address this problem: (i) we construct a simulation framework for the popular social media platform Reddit using LLM agents seeded with synthetic personal profiles; and (ii) using this framework, we generate SynthPAI, a diverse synthetic dataset of over 7,800 comments manually labeled for personal attributes. We validate our dataset with a human study showing that humans barely outperform random guessing when distinguishing our synthetic comments from real ones. Further, we verify that our dataset enables meaningful personal attribute inference research by showing, across 18 state-of-the-art LLMs, that our synthetic comments allow us to draw the same conclusions as real-world data. Combined, our experimental results, dataset, and pipeline form a strong basis for future privacy-preserving research geared towards understanding and mitigating the inference-based privacy threats that LLMs pose.