Textual geographic information is indispensable and heavily relied upon in practical applications. However, its statistical distribution has not been clearly characterized, which makes it difficult to use effectively and motivates this study. We contend that geographic information is shaped by human behavior, cognition, expression, and thought processes and, guided by an intuitive understanding of natural systems, we hypothesize that it follows the Gamma distribution. Through rigorous experiments on 24 datasets spanning different languages and types, we substantiate this hypothesis and uncover the underlying regularities of geographic information along three dimensions: quantity, length, and distance. Furthermore, theoretical analysis and comparisons with the Gaussian distribution and Zipf's law indicate that these regularities are not coincidental. Significantly, we estimate the upper bounds of human utilization of geographic information, which points to the existence of uncharted territories. We also provide guidance for geographic information extraction. We hope this work lifts the veil on geographic information and reveals its true nature.
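As an illustration of the kind of comparison described above, the following minimal sketch fits a Gamma distribution to a sample of positive values and compares its log-likelihood with that of a Gaussian fit. It is not the procedure used in the paper: the data are synthetic (drawn with `random.gammavariate`), the method-of-moments estimator is a simple stand-in for whatever fitting method the authors used, and the function names `fit_gamma_moments`, `gamma_loglik`, and `gaussian_loglik` are hypothetical.

```python
import math
import random
import statistics


def fit_gamma_moments(samples):
    """Method-of-moments estimate of the Gamma shape k and scale theta."""
    mean = statistics.fmean(samples)
    var = statistics.variance(samples)
    return mean * mean / var, var / mean


def gamma_loglik(samples, k, theta):
    """Log-likelihood of positive samples under Gamma(shape=k, scale=theta)."""
    n = len(samples)
    return (
        (k - 1) * sum(math.log(x) for x in samples)
        - sum(samples) / theta
        - n * (k * math.log(theta) + math.lgamma(k))
    )


def gaussian_loglik(samples):
    """Log-likelihood under a Gaussian with maximum-likelihood mean and variance."""
    n = len(samples)
    sigma2 = statistics.pvariance(samples)
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


# Synthetic stand-in for a measured quantity such as the number of
# geographic mentions per document (assumption: not data from the paper).
random.seed(0)
samples = [random.gammavariate(2.0, 3.0) for _ in range(5000)]

k, theta = fit_gamma_moments(samples)
ll_gamma = gamma_loglik(samples, k, theta)
ll_gauss = gaussian_loglik(samples)

print(f"estimated shape k = {k:.3f}, scale theta = {theta:.3f}")
print(f"log-likelihood: Gamma = {ll_gamma:.1f}, Gaussian = {ll_gauss:.1f}")
```

On real data, `samples` would be replaced by the observed quantities, lengths, or distances. A higher Gamma log-likelihood would be consistent with the hypothesis, although a formal goodness-of-fit test would still be needed to support it.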