Learning effective representations of urban environments requires capturing spatial structure beyond fixed administrative boundaries. Existing geospatial representation learning approaches typically aggregate Points of Interest (POIs) into pre-defined administrative regions, such as census units or ZIP code areas, and assign a single embedding to each region. However, POIs often form semantically meaningful groups that extend across, within, or beyond these boundaries, defining places that better reflect human activity and urban function. To address this limitation, we propose PlaceRep, a geospatial representation learning method that constructs place-level representations by clustering spatially and semantically related POIs. PlaceRep summarizes large-scale POI graphs from U.S. Foursquare data to produce general-purpose urban region embeddings while automatically identifying places across multiple spatial scales. By eliminating model pre-training, PlaceRep provides a scalable and efficient solution for multi-granular geospatial analysis. Experiments on two downstream tasks, population density estimation and housing price prediction, show that PlaceRep outperforms most state-of-the-art graph-based geospatial representation learning methods and achieves up to a 100x speedup in generating region-level representations on large-scale POI graphs. The implementation of PlaceRep is available at https://github.com/mohammadhashemii/PlaceRep.
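To make the idea of place-level representations concrete, the following is a minimal sketch and not PlaceRep's actual algorithm, which is not specified here. It assumes each POI has 2-D coordinates and a small semantic vector, uses plain k-means as a stand-in clustering step, and mean-pools the semantic vectors of each cluster into a place embedding. The function names `kmeans` and `place_embeddings` and the `sem_weight` parameter are hypothetical.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with a deterministic, evenly spaced initialization."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def place_embeddings(coords, sem, k, sem_weight=1.0):
    """Cluster POIs on joint spatial + semantic features, then mean-pool per cluster.

    Assumption: concatenating coordinates with weighted semantic vectors is an
    illustrative choice, not the similarity measure used by PlaceRep.
    """
    feats = [list(xy) + [sem_weight * s for s in v] for xy, v in zip(coords, sem)]
    labels = kmeans(feats, k)
    groups = {}
    for lab, v in zip(labels, sem):
        groups.setdefault(lab, []).append(v)
    emb = {lab: [sum(col) / len(vs) for col in zip(*vs)] for lab, vs in groups.items()}
    return labels, emb


# Toy data: three food POIs near the origin, three retail POIs near (10, 10).
coords = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
sem = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
labels, emb = place_embeddings(coords, sem, k=2)
print(labels)
print(emb)
```

In this sketch a place is simply a cluster of POIs, so its extent follows the data rather than an administrative boundary. A region-level embedding could then be formed by pooling the embeddings of the places that fall inside the region, and varying `k` gives a rough analogue of changing the spatial scale.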