We propose attention-based modeling of quantities at arbitrary spatial points, conditioned on related measurements taken at other locations. Our approach adapts a transformer encoder to process measurements and read-out positions together. Attention-based models perform well across many domains, which makes them a natural candidate for modeling data that is irregularly sampled in space. We introduce a novel encoding strategy that applies the same transformation to measurement positions and read-out positions, then combines the encoded positions with encoded measurement values, instead of relying on two different mappings. Efficiently learning input-output mappings from irregularly spaced data is a fundamental challenge in modeling physical phenomena. To evaluate our model, we conduct experiments on diverse problem domains: high-altitude wind nowcasting, two-day weather forecasting, fluid dynamics, and heat diffusion. Our attention-based model consistently outperforms state-of-the-art models for irregularly sampled data, such as Graph Element Networks and Conditional Neural Processes. Notably, it reduces the root mean square error (RMSE) from 9.24 to 7.98 for wind nowcasting and from 0.126 to 0.084 for a heat diffusion task. We hypothesize that this gain stems from the greater flexibility of our latent representation and from the improved data encoding. To support this hypothesis, we design a synthetic experiment that reveals excessive bottlenecking in the latent representations of alternative models, which hinders information utilization and impedes training.
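The encoding strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the random (untrained) weights, the choice of addition to combine position and value encodings, and the single unmasked attention layer are all assumptions made for illustration. The only property taken from the text is that one shared map encodes both measurement positions and read-out positions, and that measurement and read-out tokens are processed together by attention.

```python
import math
import random


def linear(rng, n_in, n_out):
    """Random n_in x n_out matrix standing in for a learned linear layer."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]


def apply(w, x):
    """Multiply row vector x by matrix w."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_tokens(meas_pos, meas_val, query_pos, params):
    """Build one token per measurement and one per read-out position."""
    tokens = []
    for p, v in zip(meas_pos, meas_val):
        pos_code = apply(params["w_pos"], p)  # shared position map
        val_code = apply(params["w_val"], v)  # separate value map
        # Assumption: position and value codes are combined by addition.
        tokens.append([a + b for a, b in zip(pos_code, val_code)])
    for q in query_pos:
        # Read-out tokens reuse the same position map and carry no value.
        tokens.append(apply(params["w_pos"], q))
    return tokens


def self_attention(tokens, params):
    """Single-head scaled dot-product self-attention over all tokens."""
    qs = [apply(params["w_q"], t) for t in tokens]
    ks = [apply(params["w_k"], t) for t in tokens]
    vs = [apply(params["w_v"], t) for t in tokens]
    scale = math.sqrt(len(qs[0]))
    out = []
    for q in qs:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in ks])
        out.append([sum(w * v[j] for w, v in zip(weights, vs))
                    for j in range(len(vs[0]))])
    return out


def predict(meas_pos, meas_val, query_pos, params):
    """Return one prediction per read-out position."""
    tokens = encode_tokens(meas_pos, meas_val, query_pos, params)
    hidden = self_attention(tokens, params)
    readout_hidden = hidden[len(meas_pos):]  # keep only the read-out tokens
    return [apply(params["w_out"], h) for h in readout_hidden]


def make_params(rng, pos_dim, val_dim, d_model):
    """Hypothetical parameter set; a real model would learn these."""
    return {
        "w_pos": linear(rng, pos_dim, d_model),
        "w_val": linear(rng, val_dim, d_model),
        "w_q": linear(rng, d_model, d_model),
        "w_k": linear(rng, d_model, d_model),
        "w_v": linear(rng, d_model, d_model),
        "w_out": linear(rng, d_model, val_dim),
    }


# Toy data: three 2-D measurement sites with scalar values, two read-out points.
params = make_params(random.Random(0), pos_dim=2, val_dim=1, d_model=8)
meas_pos = [[0.0, 0.0], [1.0, 0.5], [0.2, 0.9]]
meas_val = [[1.2], [0.7], [-0.3]]
query_pos = [[0.5, 0.5], [0.9, 0.1]]
preds = predict(meas_pos, meas_val, query_pos, params)
print(preds)
```

Because the tokens carry no sequence index, the predictions at the read-out positions do not depend on the order in which measurements are listed, which suits irregularly sampled data. In this simplified sketch the read-out tokens also attend to each other; whether the actual model allows that is not stated in the abstract.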