We empirically demonstrate that a transformer pre-trained on country-scale unlabeled human mobility data learns embeddings that, through fine-tuning, capture a deep understanding of the target geography and its mobility patterns. Using an adaptation framework, we evaluate how well our pre-trained embeddings encapsulate a broad spectrum of concepts directly and indirectly related to human mobility, ranging from basic notions, such as geographic location and distance, to more complex constructs, such as administrative divisions and land cover. Our extensive empirical analysis reveals a substantial performance boost from pre-training, reaching up to 38% in tasks such as tree-cover regression. We attribute this result to the ability of pre-training to uncover meaningful patterns hidden in the raw data that are beneficial for modeling relevant high-level concepts. The pre-trained embeddings emerge as robust representations of regions and trajectories, potentially valuable for a wide range of downstream applications.