We study how persona prompting shapes the language generated by multimodal large language models in an urban perception setting. Using 59,808 annotations from 1,200 persona-conditioned agents and two no-persona settings, we analyze captions, justifications, and perception tags across personas. Captions converge strongly across personas, whereas justifications vary systematically with socioeconomic and political attributes. Perception tags show no statistically significant persona-related differences, although non-significant trends are observed. Topic analysis further reveals that personas emphasize different evaluative themes when interpreting the same scenes.