Language models have long been shown to embed geographical information in their hidden representations. This line of work has recently been revisited and extended to Large Language Models (LLMs). In this paper, we aim to bridge the gap between the well-established and the recent literature by observing how geographical knowledge evolves as language models scale. We show that geographical knowledge is observable even in tiny models, and that it scales consistently with model size. Notably, we observe that larger language models do not mitigate the geographical bias inherent to the training data.