Measuring subjective constructs in naturally occurring social media text requires annotation procedures that are theoretically grounded, empirically validated, and transferable to an encoder model for scalable prediction. Using non-English social media posts annotated according to Schwartz's theory of basic human values, we investigate how different LLMs, prompts, and instruction languages operationalize the expression of values in text. We argue that although texts may permit multiple plausible interpretations, theory-based value definitions can constrain interpretations and reduce spurious value attributions. Beyond precision, recall, and F1, we evaluate structural alignment between values, error structure, confidence-ambiguity relations, and annotation stability. We show that different LLMs produce different value interpretations. Iterative prompt calibration through error analysis reduces misattributions and improves alignment with expert annotations. We also derive targeted expert verification rules from recurrent error structures and use them during corpus annotation. Finally, we show that LLM annotations can be transferred to an encoder model through soft-label training, retaining theory-based value interpretations and information about uncertainty in value expression.
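The soft-label transfer step mentioned at the end of the abstract can be illustrated with a small sketch. The abstract does not say how the soft labels are built or which loss is used, so everything here is an assumption: soft labels are taken as the mean of repeated binary LLM annotations per value, and the encoder is trained with binary cross-entropy against those fractional targets. The function names `aggregate_soft_labels` and `soft_label_bce` are hypothetical, and the encoder itself is replaced by raw logits to keep the example self-contained.

```python
import math


def sigmoid(z: float) -> float:
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_soft_labels(votes):
    """Turn repeated binary annotations into soft labels.

    `votes` holds one list per value, with one 0/1 entry per annotation
    run. The soft label is the share of runs that marked the value as
    expressed (an assumed aggregation rule, not the paper's).
    """
    return [sum(v) / len(v) for v in votes]


def soft_label_bce(logits, soft_targets, eps=1e-12):
    """Mean binary cross-entropy against soft targets in [0, 1].

    Each value is treated as an independent binary decision, so one
    post can express several values at once.
    """
    if len(logits) != len(soft_targets):
        raise ValueError("logits and soft_targets must have equal length")
    total = 0.0
    for z, q in zip(logits, soft_targets):
        # Clamp to avoid log(0) for extreme logits.
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= q * math.log(p) + (1.0 - q) * math.log(1.0 - p)
    return total / len(logits)


# Hypothetical post annotated in four runs for two values:
# three of four runs attribute the first value, none the second.
targets = aggregate_soft_labels([[1, 1, 1, 0], [0, 0, 0, 0]])

# A prediction that matches the annotation uncertainty is penalized
# less than an overconfident one.
calibrated = soft_label_bce([math.log(3), -10.0], targets)
overconfident = soft_label_bce([5.0, -10.0], targets)
```

Under these assumptions the loss is lowest when the predicted probability equals the soft label, so a model trained this way is pushed to reproduce the annotation uncertainty rather than a hard 0/1 decision.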