Attention-based Modeling of Physical Systems: Improved Latent Representations

We propose attention-based modeling of quantities at arbitrary spatial points conditioned on related measurements at different locations. Our approach adapts a transformer encoder to process measurements and read-out positions together. Attention-based models perform well across many domains, which makes them a natural candidate for modeling data sampled irregularly in space. We introduce a novel encoding strategy that applies the same transformation to the measurement and read-out positions, after which they are combined with the encoded measurement values, instead of relying on two different mappings. Efficiently learning input-output mappings from irregularly spaced data is a fundamental challenge in modeling physical phenomena. To evaluate our model, we conduct experiments on diverse problem domains, including high-altitude wind nowcasting, two-day weather forecasting, fluid dynamics, and heat diffusion. Our attention-based model consistently outperforms state-of-the-art models for irregularly sampled data, such as Graph Element Networks and Conditional Neural Processes. Notably, it reduces the root mean square error (RMSE) from 9.24 to 7.98 for wind nowcasting and from 0.126 to 0.084 for a heat diffusion task. We hypothesize that this superior performance stems from the greater flexibility of our latent representation and the improved data encoding technique. To support this hypothesis, we design a synthetic experiment that reveals excessive bottlenecking in the latent representations of alternative models, which hinders information utilization and impedes training.
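The encoding strategy described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes 2-D positions, scalar measurement values, linear encoders with random weights standing in for learned ones, summation as the way position and value encodings are combined, and a single self-attention layer in place of a full transformer encoder. All names (`W_pos`, `W_val`, `predict`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16  # latent width (assumed)

# Random weights stand in for learned parameters (illustrative only).
W_pos = rng.normal(size=(2, d_model))  # shared by measurement AND read-out positions
W_val = rng.normal(size=(1, d_model))  # applied to measurement values only
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
                 for _ in range(3))
W_out = rng.normal(size=(d_model, 1))  # decodes a token into a scalar prediction


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def predict(meas_pos, meas_val, read_pos):
    """Predict values at read_pos from measurements (meas_pos, meas_val).

    meas_pos: (n_meas, 2), meas_val: (n_meas, 1), read_pos: (n_read, 2).
    Returns an array of shape (n_read, 1).
    """
    # One position encoder for both kinds of token; measurement tokens
    # additionally receive the encoded value (summation is an assumption).
    meas_tokens = meas_pos @ W_pos + meas_val @ W_val
    read_tokens = read_pos @ W_pos

    # Measurements and read-out positions are processed together.
    tokens = np.concatenate([meas_tokens, read_tokens], axis=0)

    # Single-head self-attention with a residual connection.
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    attn = softmax(q @ k.T / np.sqrt(d_model))
    hidden = tokens + attn @ v

    # Only the read-out tokens are decoded.
    return hidden[len(meas_tokens):] @ W_out


meas_pos = rng.uniform(size=(5, 2))
meas_val = rng.normal(size=(5, 1))
read_pos = rng.uniform(size=(3, 2))
pred = predict(meas_pos, meas_val, read_pos)  # shape (3, 1)
```

Because no token carries an index, the prediction at each read-out position does not depend on the order in which the measurements are listed, which suits data sampled irregularly in space.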

