We demonstrate that embeddings derived from large language models, when processed with "Survey and Questionnaire Item Embeddings Differentials" (SQuID), can recover the structure of human values obtained from human rater judgments on the Revised Portrait Value Questionnaire (PVQ-RR). We compare multiple embedding models on several evaluation metrics, including internal consistency, dimension correlations and multidimensional scaling configurations. Unlike previous approaches, SQuID yields negative correlations between dimensions without requiring domain-specific fine-tuning or re-annotation of training data. Quantitatively, our embedding-based approach explains 55% of the variance in the dimension-dimension similarities observed in human data. Multidimensional scaling configurations align with pooled human data from 49 countries. Generalizability tests on three personality inventories (IPIP, BFI-2, HEXACO) show that SQuID consistently widens the range of correlations, suggesting applicability beyond value theory. These results show that semantic embeddings can replicate psychometric structures previously established through extensive human surveys. The approach offers substantial advantages in cost, scalability and flexibility while maintaining quality comparable to traditional methods. Our findings have significant implications for psychometrics and social science research, providing a complementary methodology that could expand the scope of human behavior and experience represented in measurement tools.
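To make the "variance explained" comparison concrete, here is a minimal sketch of one plausible way to compute such a figure: the squared Pearson correlation between the off-diagonal entries of an embedding-derived and a human-derived dimension-dimension similarity matrix. This is an assumption for illustration, not the documented SQuID evaluation procedure; the function names and the toy 3x3 matrices are hypothetical and do not come from the PVQ-RR data.

```python
import numpy as np


def offdiag_upper(matrix):
    """Return the entries strictly above the diagonal of a square matrix."""
    m = np.asarray(matrix, dtype=float)
    rows, cols = np.triu_indices(m.shape[0], k=1)
    return m[rows, cols]


def variance_explained(model_sim, human_sim):
    """Squared Pearson correlation between two similarity matrices.

    Only the upper off-diagonal entries are compared, because the matrices
    are symmetric and the diagonal (self-similarity) is trivially 1.
    """
    x = offdiag_upper(model_sim)
    y = offdiag_upper(human_sim)
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2


# Toy similarity matrices for three hypothetical value dimensions.
# These numbers are invented for illustration only.
human = np.array([
    [1.0, 0.5, -0.3],
    [0.5, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
])
model = np.array([
    [1.0, 0.4, -0.1],
    [0.4, 1.0, 0.2],
    [-0.1, 0.2, 1.0],
])

r2 = variance_explained(model, human)
print(f"variance explained: {r2:.3f}")
```

Restricting the comparison to off-diagonal entries avoids inflating the agreement with the diagonal, which is identical in both matrices by construction.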