Large Language Models (LLMs) inherently carry the biases contained in their training corpora, which can lead to the perpetuation of societal harm. As the impact of these foundation models grows, understanding and evaluating their biases becomes crucial to achieving fairness and accuracy. We propose to study what LLMs know about the world we live in through the lens of geography. This approach is particularly powerful because there is ground truth for the numerous aspects of human life that are meaningfully projected onto geographic space, such as culture, race, language, politics, and religion. We show various problematic geographic biases, which we define as systemic errors in geospatial predictions. First, we demonstrate that LLMs are capable of making accurate zero-shot geospatial predictions in the form of ratings that show strong monotonic correlation with ground truth (Spearman's $\rho$ of up to 0.89). We then show that LLMs exhibit common biases across a range of objective and subjective topics. In particular, LLMs are clearly biased against locations with lower socioeconomic conditions (e.g., most of Africa) on a variety of sensitive subjective topics such as attractiveness, morality, and intelligence (Spearman's $\rho$ of up to 0.70). Finally, we introduce a bias score to quantify this bias and find significant variation in its magnitude across existing LLMs. Code is available on the project website: https://rohinmanvi.github.io/GeoLLM
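A minimal sketch of the rank-correlation evaluation described above, assuming the LLM's zero-shot ratings have already been collected; `llm_ratings` and `ground_truth` are hypothetical placeholders, not the paper's actual data or prompting pipeline (see the project website for the authors' code).

```python
from scipy.stats import spearmanr

# Hypothetical zero-shot ratings (0-10) an LLM assigned to a set of
# locations, paired with ground-truth values for the same locations
# (e.g., population density or an infrastructure index).
llm_ratings = [7.0, 3.5, 9.0, 1.0, 5.5, 8.0]
ground_truth = [6.8, 2.9, 9.4, 1.3, 5.0, 7.6]

# Spearman's rho measures monotonic correlation: it is rank-based, so it
# rewards getting the relative ordering of locations right rather than
# matching exact values.
rho, p_value = spearmanr(llm_ratings, ground_truth)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3g})")
```

The same statistic underlies the bias finding: when the "ground truth" series is a socioeconomic indicator and the ratings concern a subjective topic such as intelligence, a large $\rho$ (up to 0.70 in the paper) indicates that the LLM's subjective judgments track socioeconomic conditions.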