Urban region profiling is pivotal for smart cities, but mining fine-grained semantics from noisy and incomplete urban data remains challenging. In response, we propose EUPAS, a novel self-supervised graph collaborative filtering model for urban region embedding. Specifically, heterogeneous region graphs containing human mobility data, point-of-interest (POI) information, and geographic neighborhood details for each region are fed into the model, which generates region embeddings that preserve intra-region and inter-region dependencies through GCNs and multi-head attention. Meanwhile, we introduce spatial perturbation augmentation to generate positive samples that are semantically similar and spatially close to the anchor, in preparation for subsequent contrastive learning. Furthermore, adversarial training is employed to construct an effective pretext task by generating strong positive pairs and mining hard negative pairs for the region embeddings. Finally, we jointly optimize supervised and self-supervised learning to encourage the model to capture the high-level semantics of region embeddings while ignoring noisy and unimportant details. Extensive experiments on real-world datasets demonstrate the superiority of our model over state-of-the-art methods.
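To make the contrastive and joint-optimization steps concrete, the following is a minimal sketch and not the EUPAS implementation. The abstract does not state the loss form, so an InfoNCE-style objective is assumed here. Gaussian noise stands in for the spatial perturbation augmentation, the adversarial generation of strong positives and hard negatives is omitted, and all names (`perturb`, `info_nce`, `joint_loss`, `weight`, `temperature`) are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def perturb(embedding, scale, rng):
    """Assumed augmentation: add small Gaussian noise so the positive
    sample stays close to its anchor in embedding space."""
    return [x + rng.gauss(0.0, scale) for x in embedding]


def info_nce(anchors, positives, temperature=0.5):
    """Assumed InfoNCE-style loss: positives[i] is the positive for
    anchors[i]; every other row of positives acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def joint_loss(supervised, contrastive, weight=0.1):
    """Weighted sum of a supervised loss and the self-supervised loss."""
    return supervised + weight * contrastive


rng = random.Random(0)
# Toy region embeddings: 8 regions, 16 dimensions each.
regions = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
positives = [perturb(r, 0.05, rng) for r in regions]

ssl_loss = info_nce(regions, positives)
# The supervised loss value 1.0 is a placeholder for illustration only.
total_loss = joint_loss(supervised=1.0, contrastive=ssl_loss)
```

In this sketch the loss is low when each region is most similar to its own perturbed copy and rises when anchors and positives are misaligned, which is the behavior a contrastive pretext task relies on.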