Real engineering and scientific applications often involve one or more qualitative inputs. Standard Gaussian processes (GPs), however, cannot directly accommodate qualitative inputs. The recently introduced latent variable Gaussian process (LVGP) overcomes this issue by first mapping each qualitative factor to underlying latent variables (LVs) and then using any standard GP covariance function over these LVs. The LVs are estimated in the same way as the other GP hyperparameters, through maximum likelihood estimation, and then plugged into the prediction expressions. However, this plug-in approach does not account for uncertainty in the estimation of the LVs, which can be significant, especially with limited training data. In this work, we develop a fully Bayesian approach for the LVGP model and for visualizing the effects of the qualitative inputs via their LVs. We also develop approximations for scaling up LVGPs and fully Bayesian inference for the LVGP hyperparameters. We conduct numerical studies comparing plug-in inference against fully Bayesian inference on a few engineering models and material design applications. In contrast to previous studies on standard GP modeling, which have largely concluded that a fully Bayesian treatment offers limited improvements, our results show that for LVGP modeling it offers significant improvements in prediction accuracy and uncertainty quantification over the plug-in approach.
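The mapping described in the abstract, where each level of a qualitative factor is sent to latent variables and a standard covariance function is then applied, can be illustrated with a minimal sketch. The abstract does not specify the kernel, the latent dimension, or any parameterization, so everything here is an assumption made for illustration: a squared-exponential kernel, a two-dimensional latent space, a single qualitative factor with three levels, and the hypothetical names `lvgp_kernel`, `Z`, and `lengthscale`. In the plug-in approach `Z` would be a single point estimate; a fully Bayesian treatment would instead average predictions over posterior samples of `Z`.

```python
import numpy as np


def lvgp_kernel(X, T, X2, T2, Z, lengthscale=1.0):
    """Squared-exponential covariance over numerical inputs and latent variables.

    X, X2 : (n, d) and (m, d) arrays of numerical inputs.
    T, T2 : (n,) and (m,) integer arrays with the level index of the
            qualitative factor for each point.
    Z     : (num_levels, latent_dim) array; row k holds the latent
            variables assigned to level k (illustrative values only).
    """
    # Replace each qualitative level by its latent vector, then treat the
    # result as extra numerical coordinates.
    A = np.hstack([X / lengthscale, Z[T]])
    B = np.hstack([X2 / lengthscale, Z[T2]])
    # Pairwise squared Euclidean distances in the augmented space.
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist)


# Illustrative latent positions for three levels (assumed, not estimated).
Z = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.2, 0.9]])

X = np.array([[0.1], [0.5], [0.9]])  # one numerical input
T = np.array([0, 1, 2])              # level of the qualitative factor

K = lvgp_kernel(X, T, X, T, Z)
```

Levels whose latent vectors lie close together produce strongly correlated responses, which is why plotting the estimated latent positions gives a picture of how the qualitative levels relate to one another.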