Learning effective representations of urban environments requires capturing spatial structure beyond fixed administrative boundaries. Existing geospatial representation learning approaches typically aggregate Points of Interest (POIs) into pre-defined administrative regions, such as census units or ZIP code areas, and assign a single embedding to each region. However, POIs often form semantically meaningful groups that extend across, within, or beyond these boundaries, defining places that better reflect human activity and urban function. To address this limitation, we propose PlaceRep, a training-free geospatial representation learning method that constructs place-level representations by clustering spatially and semantically related POIs. PlaceRep summarizes large-scale POI graphs built from U.S. Foursquare data to produce general-purpose urban region embeddings while automatically identifying places across multiple spatial scales. By eliminating model pre-training, PlaceRep provides a scalable and efficient solution for multi-granular geospatial analysis. Experiments on two downstream tasks, population density estimation and housing price prediction, show that PlaceRep outperforms most state-of-the-art graph-based geospatial representation learning methods and achieves up to a 100x speedup in generating region-level representations on large-scale POI graphs. The implementation of PlaceRep is available at https://github.com/mohammadhashemii/PlaceRep.
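To make the idea of grouping spatially and semantically related POIs into places more concrete, the following is a minimal illustrative sketch, not the PlaceRep algorithm itself. It assumes each POI is a hypothetical `(x, y, feature_vector)` tuple, links two POIs when they lie within `max_dist` of each other and their feature vectors have cosine similarity of at least `min_sim`, and treats each connected component as a place whose embedding is the mean of its members' features. The function names, both thresholds, and the mean aggregation are assumptions made for illustration only.

```python
from math import hypot, sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_pois(pois, max_dist, min_sim):
    """Group POIs into places: connected components of a graph whose edges
    join POIs that are both spatially close and semantically similar.
    This is an illustrative stand-in, not the clustering used by PlaceRep."""
    parent = list(range(len(pois)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pois)):
        for j in range(i + 1, len(pois)):
            xi, yi, fi = pois[i]
            xj, yj, fj = pois[j]
            if hypot(xi - xj, yi - yj) <= max_dist and cosine(fi, fj) >= min_sim:
                parent[find(i)] = find(j)

    places = {}
    for i in range(len(pois)):
        places.setdefault(find(i), []).append(i)
    return sorted(places.values())


def place_embeddings(pois, places):
    """Training-free place embedding: mean of the member POI feature vectors."""
    result = []
    for members in places:
        vectors = [pois[i][2] for i in members]
        result.append([sum(col) / len(vectors) for col in zip(*vectors)])
    return result


# Toy data (hypothetical): two food venues, a nearby bank, and a distant cafe.
pois = [
    (0.0, 0.0, [1.0, 0.0]),
    (0.1, 0.0, [0.9, 0.1]),
    (0.1, 0.1, [0.0, 1.0]),
    (5.0, 5.0, [1.0, 0.0]),
]

fine_places = cluster_pois(pois, max_dist=0.5, min_sim=0.8)     # [[0, 1], [2], [3]]
coarse_places = cluster_pois(pois, max_dist=10.0, min_sim=0.8)  # [[0, 1, 3], [2]]
fine_embeddings = place_embeddings(pois, fine_places)
```

Varying `max_dist` is one simple way to obtain places at several spatial scales, and because no parameters are learned the whole pipeline stays training-free. The pairwise loop is quadratic in the number of POIs, so a large-scale setting would need a spatial index or a precomputed POI graph instead.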