Gaussian processes (GPs) are generally regarded as the gold-standard surrogate model for emulating computationally expensive computer simulators. However, training a GP as accurately as possible with a minimum number of model evaluations remains challenging. We address this problem by proposing a novel adaptive sampling criterion called VIGF (variance of improvement for global fit). The improvement function at any point measures the deviation of the GP emulator from the nearest observed model output. At each iteration of the proposed algorithm, a new run is performed where VIGF is largest; the new sample is then added to the design and the emulator is updated accordingly. A batch version of VIGF is also proposed, which can save the user time when parallel computing is available. Additionally, VIGF is extended to the multi-fidelity case, in which the expensive high-fidelity model is predicted with the assistance of a lower-fidelity simulator via hierarchical kriging. The applicability of our method is assessed on a set of test functions, and its performance is compared with that of several sequential sampling strategies. The results suggest that our method outperforms the competing strategies in predicting the benchmark functions in most cases. An implementation of VIGF is available in the dgpsi R package, which can be found on CRAN.
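The sequential procedure described in the abstract (score candidate points, run the simulator where the criterion is largest, add the run to the design, update the emulator) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the dgpsi implementation. It assumes the improvement at x is the squared difference between the GP output and the simulator output y* at the design point nearest to x in input space; if the GP prediction is Gaussian with mean m(x) and variance s^2(x), the variance of that quantity is 4 s^2(x) (m(x) - y*)^2 + 2 s^4(x). The zero-mean GP, squared-exponential kernel, fixed length-scale, toy simulator, candidate grid, and all function names are hypothetical choices made for the example.

```python
import numpy as np


def rbf(a, b, ls):
    # Squared-exponential kernel between two 1-D arrays of inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)


def gp_predict(X, y, Xs, ls=0.1, jitter=1e-8):
    # Zero-mean GP with unit prior variance and a fixed length-scale
    # (assumption: no hyperparameter estimation in this sketch).
    K = rbf(X, X, ls) + jitter * np.eye(len(X))
    Ks = rbf(Xs, X, ls)
    mean = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.clip(var, 0.0, None)  # clip tiny negative round-off


def vigf(X, y, Xs, ls=0.1):
    # Assumed form of the criterion: variance of (Y(x) - y*)^2 with
    # Y(x) ~ N(mean, var) and y* the output at the nearest design point.
    mean, var = gp_predict(X, y, Xs, ls)
    nearest = np.argmin(np.abs(Xs[:, None] - X[None, :]), axis=1)
    d = mean - y[nearest]
    return 4.0 * var * d ** 2 + 2.0 * var ** 2


def simulator(x):
    # Hypothetical stand-in for an expensive simulator.
    return np.sin(2.0 * np.pi * x) + x


def run_sequential_design(n_init=4, n_add=6, n_cand=201):
    # Start from a small space-filling design on [0, 1].
    X = np.linspace(0.0, 1.0, n_init)
    y = simulator(X)
    cand = np.linspace(0.0, 1.0, n_cand)
    for _ in range(n_add):
        scores = vigf(X, y, cand)
        x_new = cand[np.argmax(scores)]     # run where the criterion is largest
        X = np.append(X, x_new)             # add the new sample to the design
        y = np.append(y, simulator(x_new))  # emulator is refit on the next call
    return X, y
```

Calling `run_sequential_design()` returns the augmented design and the corresponding simulator outputs. Because the predictive variance vanishes at points that have already been run, the criterion is close to zero there, so the loop tends to select new locations instead of repeating existing ones. A practical implementation would also estimate the kernel hyperparameters at each update and optimise the criterion over a continuous input space, neither of which is done here.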