$GRU^{spa}$: Gated Recurrent Unit with Spatial Attention for Spatio-Temporal Disaggregation

Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate learning and integration for downstream AI/ML systems. In this work, we consider models to disaggregate spatio-temporal data from a low-resolution, irregular partition (e.g., census tract) to a high-resolution, irregular partition (e.g., city block). We propose a model, Gated Recurrent Unit with Spatial Attention ($GRU^{spa}$), where spatial attention layers are integrated into the original Gated Recurrent Unit (GRU) model. The spatial attention layers capture spatial interactions among regions, while the gated recurrent module captures the temporal dependencies. Additionally, we utilize containment relationships between different geographic levels (e.g., when a given city block is wholly contained in a given census tract) to constrain the spatial attention layers. For situations where limited historical training data is available, we study transfer learning scenarios and show that a model pre-trained on one city variable can be fine-tuned for another city variable using only a few hundred samples. Evaluating these techniques on two mobility datasets, we find that $GRU^{spa}$ provides a significant improvement over other neural models as well as typical heuristic methods, allowing us to synthesize realistic point data over small regions useful for training downstream models.
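
The abstract does not spell out how containment constrains the attention layers. As a minimal sketch of one possible reading, the Python below masks attention scores so that each census tract distributes its value only over the city blocks it contains. The function names, the score matrix, and the choice of a per-tract softmax are assumptions made for illustration, not the authors' architecture.

```python
import math

def containment_attention(scores, contains):
    """Softmax each tract's scores over only the blocks it contains.

    scores[t][b]   -- hypothetical learned affinity between tract t and block b
    contains[t][b] -- True when block b lies wholly inside tract t
    """
    weights = []
    for row_scores, row_mask in zip(scores, contains):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:  # a tract with no known blocks gets zero weights
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def disaggregate(tract_values, weights):
    """Spread each tract value over blocks in proportion to its weights."""
    n_blocks = len(weights[0])
    return [sum(w[b] * v for w, v in zip(weights, tract_values))
            for b in range(n_blocks)]

# Toy example: tract 0 contains blocks 0-1, tract 1 contains blocks 2-3.
contains = [[True, True, False, False],
            [False, False, True, True]]
scores = [[0.0, 0.0, 5.0, 5.0],            # the 5.0 entries are masked out
          [1.0, 1.0, 0.0, math.log(3.0)]]
weights = containment_attention(scores, contains)
blocks = disaggregate([10.0, 8.0], weights)  # one time step of tract counts
```

Because each tract's weights sum to one over its own blocks, the block estimates in this toy setup add back up to the tract totals (18 here). In a full model the scores would presumably come from learned layers and vary over time through the recurrent module.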

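Gated Recurrent Unit

The gated recurrent unit (GRU) is a gating mechanism for recurrent neural networks, introduced by Kyunghyun Cho et al. in 2014. A GRU resembles a long short-term memory (LSTM) unit with a forget gate, but it has fewer parameters than an LSTM because it lacks an output gate. GRUs have shown better performance on certain smaller and less frequent datasets. On polyphonic music modeling, speech signal modeling, and some natural language processing tasks, GRU performance is similar to that of LSTM.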
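
The gate structure of a GRU can be sketched as a single time step in plain Python. The parameter dictionary layout and the function names are hypothetical, and the update rule follows one common convention in which the update gate `z` weights the candidate state (some references swap the roles of `z` and `1 - z`).

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def gru_step(x, h, p):
    """One GRU step: W_* act on the input x, U_* on the state, b_* are biases."""
    def affine(gate, hidden):
        wx = matvec(p["W_" + gate], x)
        uh = matvec(p["U_" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, p["b_" + gate])]

    z = [sigmoid(v) for v in affine("z", h)]             # update gate
    r = [sigmoid(v) for v in affine("r", h)]             # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]          # reset applied to state
    cand = [math.tanh(v) for v in affine("h", gated_h)]  # candidate state
    # Blend the old state and the candidate; there is no separate output gate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

# Toy check with all-zero parameters: both gates equal 0.5 and the
# candidate is 0, so the new state is half of the old state.
n_in, n_hid = 1, 2
params = {}
for gate in ("z", "r", "h"):
    params["W_" + gate] = [[0.0] * n_in for _ in range(n_hid)]
    params["U_" + gate] = [[0.0] * n_hid for _ in range(n_hid)]
    params["b_" + gate] = [0.0] * n_hid
h_next = gru_step([3.0], [2.0, -4.0], params)
```

The step uses three weight groups (`z`, `r`, `h`), whereas a standard LSTM uses four (input, forget, and output gates plus the cell candidate), which is where the smaller parameter count comes from.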
