Large language models (LLMs) are increasingly used to describe, evaluate and interpret places, yet it remains unclear whether they do so from a culturally neutral standpoint. Here we test urban perception in frontier LLMs using a balanced global street-view sample and prompts that either remain neutral or invoke different regional cultural standpoints. Across open-ended descriptions and structured place judgments, the neutral condition proved not to be neutral in practice. Prompts associated with Europe and Northern America remained systematically closer to the baseline than many non-Western prompts, indicating that model perception is organized around a culturally uneven reference frame rather than a universal one. Cultural prompting also shifted affective evaluation, producing sentiment-based ingroup preference for some prompted identities. Comparisons with regional human text-image benchmarks showed that culturally proximate prompting could improve alignment with human descriptions, but it did not recover human levels of semantic diversity and often preserved an affectively elevated style. The same asymmetry reappeared in structured judgments of safety, beauty, wealth, liveliness, boredom and depression, where model outputs were interpretable but only partly reproduced human group differences. These findings suggest that LLMs do not simply perceive cities from nowhere: they do so through a culturally uneven baseline that shapes what appears ordinary, familiar and positively valued.