Latent Generative Modeling of Random Fields from Limited Training Data

The ability to accurately model random fields plays a critical role in science and engineering for problems involving uncertain, spatially varying quantities such as heterogeneous material properties and turbulent flows. Deep generative models offer a powerful tool for sampling high- or infinite-dimensional uncertainties like random fields, but their reliance on large, dense training datasets limits their applicability in contexts where sufficient data is difficult or expensive to obtain. In this work, we propose a latent-space approach to generative modeling of random fields that incorporates domain knowledge to supplement limited training data. A constraint-aware variational autoencoder (VAE) with a function decoder is first used to learn compact latent representations of continuous functions that adhere to known physical or statistical constraints, even when training data is sparse or indirect. Generative modeling is then performed in the learned latent space, decoupling constraint enforcement from the sampling process. This decoupling enables expressive multi-step generative methods to be deployed in data-limited settings where existing constrained multi-step approaches are not directly applicable. The richer latent distributions captured by the generative model also overcome limitations of standard VAEs, which rely on simple parametric priors and struggle to represent complex, multimodal, or heavy-tailed distributions over functions. The approach is demonstrated on two challenging applications: wind velocity field reconstruction from sparse sensors and material property inference from indirect measurements. Results show that incorporating domain-knowledge constraints is effective for data-limited problems, and that the latent generative modeling approach improves sample quality and robustness relative to sampling directly from a constrained VAE.
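The two-stage structure described above can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the sine-series `decode` function stands in for a function decoder, non-negativity stands in for a known physical constraint, the loss weights are arbitrary, and `sample_latents` uses simple Gaussian kernel resampling as a placeholder for the multi-step latent generative model. No training is performed; the sketch only shows how the pieces could fit together.

```python
import numpy as np

def decode(z, x, W):
    """Hypothetical function decoder: evaluate the field at arbitrary points x.

    z: (d,) latent code, x: (n,) coordinates in [0, 1], W: (d, k) weights.
    The field is a sine series whose coefficients are a linear map of z.
    """
    k = W.shape[1]
    basis = np.sin(np.pi * np.outer(x, np.arange(1, k + 1)))  # (n, k)
    return basis @ (z @ W)  # (n,)

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def constrained_loss(z, mu, log_var, W, x_obs, u_obs, x_col, weight=10.0):
    """Stage 1 loss for one sample: sparse data misfit + KL + soft constraint penalty."""
    data_misfit = np.mean((decode(z, x_obs, W) - u_obs) ** 2)  # sensor locations only
    # Assumed constraint: the field should be non-negative at collocation points.
    violation = np.maximum(0.0, -decode(z, x_col, W))
    return data_misfit + kl_to_standard_normal(mu, log_var) + weight * np.mean(violation**2)

def sample_latents(encoded, n, bandwidth, rng):
    """Stage 2 placeholder: Gaussian kernel resampling of encoded latents.

    A stand-in for a multi-step latent generative model; it can follow a
    multimodal latent distribution that a standard normal prior would miss.
    """
    idx = rng.integers(0, len(encoded), size=n)
    return encoded[idx] + bandwidth * rng.standard_normal((n, encoded.shape[1]))

rng = np.random.default_rng(0)
d, k = 2, 4
W = rng.standard_normal((d, k))        # untrained decoder weights (illustrative)
x_obs = np.array([0.2, 0.5, 0.9])      # sparse sensor locations
x_col = np.linspace(0.0, 1.0, 50)      # collocation points for the constraint

# Stage 1: evaluate the constrained loss for one reparameterized latent sample.
mu, log_var = np.zeros(d), np.zeros(d)
z = mu + np.exp(0.5 * log_var) * rng.standard_normal(d)
u_obs = decode(z, x_obs, W)            # synthetic sensor readings for the demo
loss = constrained_loss(z, mu, log_var, W, x_obs, u_obs, x_col)

# Stage 2: sample new latents from a bimodal set of (synthetic) encodings,
# then decode each one on a dense grid to obtain continuous field samples.
encoded = np.concatenate([rng.normal(-2.0, 0.3, (100, d)), rng.normal(2.0, 0.3, (100, d))])
z_new = sample_latents(encoded, 5, 0.1, rng)
x_dense = np.linspace(0.0, 1.0, 200)
fields = np.stack([decode(zi, x_dense, W) for zi in z_new])  # (5, 200)
```

Because the constraint penalty enters only the stage 1 loss, the stage 2 sampler never needs to know about it, which is the decoupling the abstract refers to.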
