Because traits such as identities, cultures, and political attitudes are correlated, seemingly innocuous preferences like following a band or using specific slang can reveal private traits. This possibility, especially when combined with massive public social data and advanced computational methods, poses a fundamental privacy risk. As growing online data exposure and the rapid advancement of AI increase the risk of misuse, it is critical to understand the capacity of large language models (LLMs) to exploit such correlations. Here, using online discussions from DebateOrg and Reddit, we show that LLMs can reliably infer hidden political alignment, significantly outperforming traditional machine learning models. Prediction accuracy further improves as we aggregate multiple text-level inferences into a user-level prediction, and as we draw on more politics-adjacent domains. We demonstrate that LLMs leverage words that are highly predictive of political alignment without being explicitly political. Our findings underscore the capacity and risks of LLMs in exploiting socio-cultural correlates.
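To make the aggregation step concrete, here is a minimal sketch of collapsing text-level inferences into a user-level prediction. It assumes a simple majority vote over per-text labels; the label names and the aggregation rule are illustrative assumptions, not the paper's exact method.

```python
from collections import Counter

# Hypothetical per-text labels produced by an LLM classifier for one user's
# posts. In practice, each label would come from prompting the model with a
# single discussion text.
text_level_labels = ["liberal", "conservative", "liberal", "liberal"]

def aggregate_user_prediction(labels: list[str]) -> str:
    """Collapse multiple text-level inferences into one user-level
    prediction via majority vote (an assumed aggregation rule)."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0]

print(aggregate_user_prediction(text_level_labels))  # -> "liberal"
```

Intuitively, per-text errors that are roughly independent tend to cancel under aggregation, which is consistent with the reported accuracy gain when moving from text-level to user-level predictions.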