Complex computer codes or models can often be run at a hierarchy of levels of complexity, ranging from the very basic to the sophisticated. The top levels of this hierarchy are typically expensive to run, which limits the number of runs that are feasible. To make use of runs at all levels and, crucially, to improve predictions at the top level, we use multi-level Gaussian process (GP) emulators. The accuracy of a GP emulator depends strongly on the design of its training points. In this paper, we present a multi-level adaptive sampling algorithm that sequentially adds design points so as to best improve the fit of the GP. The normalised expected leave-one-out cross-validation error is calculated at all unobserved locations, and a new design point is chosen using expected improvement combined with a repulsion function. This criterion is calculated for each model level and weighted by the cost of running the code at that level. Hence, at each iteration, our algorithm optimises over both the location of the new point and the model level. The algorithm covers batch selection as well as single-point selection, where batches can be designed for a single level or optimally across all levels.
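To make the selection step concrete, the following is a minimal sketch of how a cost-weighted criterion could pick both a location and a level. It is not the authors' implementation. The function names `repulsion` and `select_point_and_level`, the input `ei_per_level` (a stand-in for the expected-improvement values derived from the normalised expected leave-one-out error), the squared-exponential form of the repulsion factor, the `length_scale` parameter, and the choice to divide by cost are all assumptions made for illustration.

```python
import numpy as np


def repulsion(candidates, design, length_scale):
    """Assumed repulsion factor in [0, 1]: it is 0 at an existing design point
    and approaches 1 far away from all of them."""
    if len(design) == 0:
        return np.ones(len(candidates))
    # Pairwise squared distances, shape (n_candidates, n_design).
    d2 = ((candidates[:, None, :] - design[None, :, :]) ** 2).sum(axis=-1)
    return np.prod(1.0 - np.exp(-0.5 * d2 / length_scale**2), axis=1)


def select_point_and_level(candidates, ei_per_level, designs, costs,
                           length_scale=0.1):
    """Return (level, point, score) maximising EI * repulsion / cost.

    candidates   : array (n, d) of unobserved locations
    ei_per_level : one array (n,) of criterion values per level (hypothetical input)
    designs      : one array (m_l, d) of existing design points per level
    costs        : one positive run cost per level
    """
    best_score, best_level, best_idx = -np.inf, None, None
    for level, (ei, design, cost) in enumerate(zip(ei_per_level, designs, costs)):
        # Assumption: cost enters as a simple divisor of the criterion.
        score = ei * repulsion(candidates, design, length_scale) / cost
        idx = int(np.argmax(score))
        if score[idx] > best_score:
            best_score, best_level, best_idx = float(score[idx]), level, idx
    return best_level, candidates[best_idx], best_score


# Usage with made-up numbers: two levels, three candidate locations in 1D.
candidates = np.array([[0.0], [0.5], [1.0]])
ei_per_level = [np.array([0.1, 1.0, 0.2]), np.array([0.1, 5.0, 0.2])]
designs = [np.array([[0.5]]), np.empty((0, 1))]  # level 0 already has a run at 0.5
level, point, score = select_point_and_level(
    candidates, ei_per_level, designs, costs=[1.0, 10.0]
)
print(level, point, score)
```

In this sketch the repulsion factor removes the candidate at 0.5 from consideration on the cheap level, because a run already exists there, while the cost divisor decides whether the expensive level is still worth a run. A batch variant could call the function repeatedly, appending each selected point to the design of its level before the next call.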