Recent years have brought advances in using neural networks for representation learning of various linguistic and visual phenomena. New methods have freed data scientists from hand-crafting features for common tasks. Similarly, problems that involve a spatial variable can benefit from pretrained map region representations, which replace the feature tables one would otherwise have to prepare manually for each task. However, very few methods for map area representation exist, especially with respect to road network characteristics. In this paper, we propose a method for generating embeddings of microregions with respect to their road infrastructure characteristics. We base our representations on OpenStreetMap road networks in a selection of cities and use the H3 spatial index to allow reproducible and scalable representation learning. The resulting vector representations capture how similar map hexagons are in terms of the road networks they contain. Additionally, we observe that the embeddings yield a latent space in which arithmetic operations are meaningful. Finally, clustering methods allowed us to draft a high-level typology of the obtained representations. We are confident that this contribution will aid data scientists working on infrastructure-related prediction tasks with spatial variables.
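To make the idea of similarity and arithmetic in an embedding space concrete, the following is a minimal sketch and not the method of the paper. It assumes toy three-dimensional vectors and hypothetical region labels (`residential`, `industrial_with_motorway`, and so on) standing in for learned hexagon embeddings; the helper names `cosine_similarity`, `combine`, and `nearest` are likewise illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def combine(a, b, c):
    """Element-wise a - b + c, the usual analogy-style vector arithmetic."""
    return [x - y + z for x, y, z in zip(a, b, c)]


def nearest(query, embeddings):
    """Return the key whose vector is most similar to the query."""
    return max(embeddings, key=lambda k: cosine_similarity(query, embeddings[k]))


# Toy vectors with hypothetical labels; real embeddings would be learned
# from road network features and keyed by H3 cell identifiers.
toy_embeddings = {
    "residential": [1.0, 0.0, 0.0],
    "residential_with_motorway": [1.0, 1.0, 0.0],
    "industrial": [0.0, 0.0, 1.0],
    "industrial_with_motorway": [0.0, 1.0, 1.0],
}

# "Remove" the residential component and "add" the industrial one.
query = combine(
    toy_embeddings["residential_with_motorway"],
    toy_embeddings["residential"],
    toy_embeddings["industrial"],
)
print(nearest(query, toy_embeddings))
```

In this constructed example the query vector coincides with the `industrial_with_motorway` vector, so that label is returned. With learned embeddings the match would only be approximate, and whether such arithmetic is meaningful has to be checked empirically.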