Variational Encoder-Decoders for Learning Latent Representations of Physical Systems

We present a deep-learning Variational Encoder-Decoder (VED) framework for learning data-driven low-dimensional representations of the relationship between high-dimensional parameters of a physical system and the system's high-dimensional observable response. The framework consists of two deep learning-based probabilistic transformations: an encoder mapping parameters to latent codes and a decoder mapping latent codes to the observable response. The hyperparameters of these transformations are identified by maximizing a variational lower bound on the log-conditional distribution of the observable response given parameters. To promote the disentanglement of latent codes, we equip this variational loss with a penalty on the off-diagonal entries of the covariance of the aggregate distribution of codes. This regularization penalty encourages the pushforward of a standard Gaussian distribution of latent codes to approximate the marginal distribution of the observable response. Using the proposed framework, we successfully model the hydraulic pressure response at observation wells of a groundwater flow model as a function of its discrete log-hydraulic transmissivity field. Compared to the canonical correlation analysis encoding, the VED model achieves a lower-dimensional latent representation, with as few as $r = 50$ latent dimensions, without a significant loss of reconstruction accuracy. We explore the impact of regularization on model performance and find that KL-divergence and covariance regularization improve feature disentanglement in latent space while maintaining reconstruction accuracy. Furthermore, we evaluate the generative capabilities of the regularized model by decoding random Gaussian noise, and find that tuning both the $\beta$ and $\lambda$ parameters enhances the quality of the generated observable response data.
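The regularized loss described above can be illustrated with a minimal NumPy sketch. It rests on several assumptions not stated in the abstract: the encoder outputs a diagonal Gaussian (`mu`, `log_var`), the reconstruction term is a squared error, $\beta$ weights the KL-divergence term, $\lambda$ weights the covariance penalty, and the aggregate covariance is estimated from a batch of sampled codes. All function names are hypothetical.

```python
import numpy as np

def offdiag_cov_penalty(z):
    """Sum of squared off-diagonal entries of the sample covariance
    of a batch of latent codes z with shape (n_samples, r)."""
    cov = np.atleast_2d(np.cov(z, rowvar=False))
    off = cov - np.diag(np.diag(cov))  # zero out the diagonal
    return float(np.sum(off ** 2))

def kl_to_standard_normal(mu, log_var):
    """Batch mean of KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)
    return float(np.mean(kl))

def regularized_loss(y, y_hat, mu, log_var, z, beta=1.0, lam=1.0):
    """Assumed form: reconstruction error + beta * KL + lam * covariance penalty."""
    recon = float(np.mean(np.sum((y - y_hat) ** 2, axis=1)))
    return (recon
            + beta * kl_to_standard_normal(mu, log_var)
            + lam * offdiag_cov_penalty(z))
```

In this sketch, perfectly correlated codes incur a positive covariance penalty, while codes with uncorrelated components incur none, which is the sense in which the penalty promotes disentanglement.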
