BITS for GAPS: Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates

We introduce Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates (BITS for GAPS), a framework for information-theoretic experimental design with Gaussian process-based surrogate models. Unlike standard methods, which use fixed or point-estimated hyperparameters in acquisition functions, our approach propagates hyperparameter uncertainty into the sampling criterion through Bayesian hierarchical modeling. In this framework, a latent function receives a Gaussian process prior, while the hyperparameters are assigned additional priors that capture the modeler's knowledge of the governing physical phenomena. Consequently, the acquisition function incorporates uncertainty in both the latent function and its hyperparameters, so that sampling is guided by both data scarcity and model uncertainty. We further establish two theoretical results in this setting: a closed-form approximation of, and a lower bound on, the posterior differential entropy. We demonstrate the framework's utility for hybrid modeling with a vapor-liquid equilibrium case study. Specifically, we build a surrogate model for the latent activity coefficients of a binary mixture and embed it in an extended form of Raoult's law to obtain a hybrid model, which then informs distillation design. The case study shows how partial physical knowledge can be translated into a hierarchical Gaussian process surrogate. It also shows that BITS for GAPS increases expected information gain and predictive accuracy by targeting high-uncertainty regions of the Wilson activity model. Overall, BITS for GAPS is a general, uncertainty-aware framework for adaptive data acquisition in complex physical systems.
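To make the idea of propagating hyperparameter uncertainty into an entropy-based sampling criterion concrete, the following is a minimal sketch and not the method or the theoretical results of this work. It assumes a one-dimensional input, a squared-exponential kernel, log-normal hyperpriors chosen only for illustration, and self-normalized importance sampling from the prior in place of whatever posterior inference the authors use. All function names (`rbf`, `gp_predict`, `entropy_scores`) are hypothetical. The predictive distribution at each candidate is then a weighted mixture of Gaussians. Its differential entropy is bounded below by the weighted average of the component entropies (entropy is concave) and above by the entropy of the moment-matched Gaussian (the Gaussian maximizes entropy at fixed variance). These are generic bounds and need not coincide with the closed-form approximation or the lower bound derived in the paper.

```python
import numpy as np


def rbf(a, b, ell, sf2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return sf2 * np.exp(-0.5 * (d / ell) ** 2)


def gp_predict(X, y, Xs, ell, sf2, sn2):
    """GP predictive mean/variance at Xs and log marginal likelihood of y."""
    K = rbf(X, X, ell, sf2) + sn2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs, ell, sf2)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.maximum(sf2 - np.sum(v ** 2, axis=0), 1e-12)
    log_ml = (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
              - 0.5 * len(X) * np.log(2.0 * np.pi))
    return mu, var, log_ml


def entropy_scores(X, y, Xs, n_hyper=200, sn2=1e-4, seed=0):
    """Entropy bounds of the hyperparameter-marginalized prediction at Xs."""
    rng = np.random.default_rng(seed)
    # Illustrative hyperpriors (assumptions, not taken from the paper).
    ells = rng.lognormal(mean=np.log(0.3), sigma=0.5, size=n_hyper)
    sf2s = rng.lognormal(mean=0.0, sigma=0.5, size=n_hyper)

    mus, variances, log_w = [], [], []
    for ell, sf2 in zip(ells, sf2s):
        mu, var, log_ml = gp_predict(X, y, Xs, ell, sf2, sn2)
        mus.append(mu)
        variances.append(var)
        log_w.append(log_ml)
    mus, variances, log_w = map(np.array, (mus, variances, log_w))

    # Prior samples weighted by marginal likelihood approximate the
    # hyperparameter posterior (self-normalized importance sampling).
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Lower bound: weighted average of the per-component Gaussian entropies.
    lower = 0.5 * (w @ np.log(2.0 * np.pi * np.e * variances))
    # Upper bound: entropy of the Gaussian matching the mixture's variance.
    mean = w @ mus
    mm_var = w @ (variances + mus ** 2) - mean ** 2
    upper = 0.5 * np.log(2.0 * np.pi * np.e * mm_var)
    return lower, upper


# Toy usage: three observations clustered on the left of [0, 1].
X = np.array([0.0, 0.1, 0.2])
y = np.sin(2.0 * np.pi * X)
Xs = np.linspace(0.0, 1.0, 51)
lower, upper = entropy_scores(X, y, Xs)
x_next = Xs[np.argmax(upper)]  # candidate with the largest entropy score
```

With a single fixed hyperparameter setting the two bounds coincide, which recovers the usual maximum-variance criterion. The gap between them reflects disagreement among hyperparameter samples, which is the kind of model uncertainty a point estimate discards.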
