Large Language Models (LLMs) inherently carry the biases contained in their training corpora, which can perpetuate societal harm. As the impact of these foundation models grows, understanding and evaluating their biases becomes crucial to achieving fairness and accuracy. We propose to study what LLMs know about the world we live in through the lens of geography. This approach is particularly powerful because ground truth exists for the many aspects of human life that project meaningfully onto geographic space, such as culture, race, language, politics, and religion. We show various problematic geographic biases, which we define as systemic errors in geospatial predictions. First, we demonstrate that LLMs are capable of making accurate zero-shot geospatial predictions in the form of ratings that show strong monotonic correlation with ground truth (Spearman's $\rho$ of up to 0.89). We then show that LLMs exhibit common biases across a range of objective and subjective topics. In particular, LLMs are clearly biased against locations with lower socioeconomic conditions (e.g., most of Africa) on a variety of sensitive subjective topics such as attractiveness, morality, and intelligence (Spearman's $\rho$ of up to 0.70). Finally, we introduce a bias score to quantify this bias and find that its magnitude varies significantly across existing LLMs.
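For readers unfamiliar with the correlation measure reported above, the following is a minimal sketch of how Spearman's $\rho$ between a set of model ratings and ground-truth values can be computed. It is illustrative only and is not the authors' evaluation code: the helper names (`average_ranks`, `pearson`, `spearman_rho`) are hypothetical, and the numbers are made up for the example rather than taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values for five locations (NOT data from the paper):
# a ground-truth quantity and the ratings a model assigned to the same places.
ground_truth = [12.0, 340.5, 87.2, 5.1, 150.0]
model_ratings = [2.0, 9.5, 6.0, 1.5, 5.5]

rho = spearman_rho(ground_truth, model_ratings)
print(round(rho, 2))  # 0.9
```

Because $\rho$ depends only on ranks, it rewards a model for ordering locations correctly even when its ratings are on a different scale from the ground truth, which is why it suits rating-style predictions.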