As Large Language Models (LLMs) demonstrate increasingly human-like abilities in natural language processing (NLP) tasks that are bound to become integral to personalized technologies, understanding their capabilities and inherent biases is crucial. Our study investigates whether LLMs like ChatGPT can infer individuals' psychological dispositions from their digital footprints. Specifically, we assess the ability of GPT-3.5 and GPT-4 to derive the Big Five personality traits from users' Facebook status updates in a zero-shot learning scenario. Our results show an average correlation of r = .29 (range = [.22, .33]) between LLM-inferred and self-reported trait scores. Furthermore, our findings point to gender- and age-related biases in personality inferences: inferred scores showed smaller errors for women and younger individuals on several traits, suggesting a potential systematic bias stemming from the underlying training data or from differences in online self-expression.