We empirically demonstrate that a transformer pre-trained on country-scale unlabeled human mobility data learns embeddings that, through fine-tuning, develop a deep understanding of the target geography and its mobility patterns. Using an adaptation framework, we evaluate how well our pre-trained embeddings encapsulate a broad spectrum of concepts directly and indirectly related to human mobility, ranging from basic notions, such as geographic location and distance, to more complex constructs, such as administrative divisions and land cover. Our extensive empirical analysis reveals a substantial performance boost from pre-training, reaching up to 38% on tasks such as tree-cover regression. We attribute this result to the ability of pre-training to uncover meaningful patterns hidden in the raw data that aid the modeling of relevant high-level concepts. The pre-trained embeddings thus emerge as robust representations of regions and trajectories, potentially valuable for a wide range of downstream applications.