BOIS: Bayesian Optimization of Interconnected Systems

Bayesian optimization (BO) has proven to be an effective paradigm for the global optimization of expensive-to-sample systems. One of its main advantages is the use of Gaussian processes (GPs) to characterize model uncertainty, which can be leveraged to guide the learning and search process. However, BO typically treats systems as black boxes, which limits its ability to exploit structural knowledge (e.g., physics and sparse interconnections). Composite functions of the form $f(x, y(x))$, in which GP modeling is shifted from the performance function $f$ to an intermediate function $y$, offer an avenue for exploiting such knowledge. Their use in a BO framework is complicated, however, by the need to derive a probability density for $f$ from the Gaussian density of $y$ computed by the GP; when $f$ is nonlinear, this density generally has no closed-form expression. Previous work has handled this issue with sampling techniques, which are flexible and easy to implement but computationally intensive. In this work, we introduce a new paradigm that allows composite functions to be used efficiently in BO: adaptive linearizations of $f$ yield closed-form expressions for the statistical moments of the composite function. We show that this simple approach (which we call BOIS) enables the exploitation of structural knowledge, such as that arising in interconnected systems, in systems that embed multiple GP models, and in systems that combine physics and GP models. Using a chemical process optimization case study, we benchmark BOIS against standard BO and sampling approaches. Our results indicate that BOIS achieves performance gains and accurately captures the statistics of composite functions.
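
The contrast between linearization and sampling can be illustrated with a minimal sketch. It is not the BOIS implementation: it assumes that $f$ is linearized to first order around the GP posterior mean of $y$, and the function names (`linearized_moments`, `sampled_moments`), the toy performance function, and the numerical values of the mean and covariance are all hypothetical. Under that assumption, the mean of $f$ is approximated by $f(x, \mu_y)$ and its variance by $J \Sigma_y J^\top$, where $J$ is the gradient of $f$ with respect to $y$ at $\mu_y$.

```python
import numpy as np


def linearized_moments(f, x, mu_y, cov_y, eps=1e-6):
    """First-order approximation of the mean and variance of f(x, y), y ~ N(mu_y, cov_y).

    Assumption: f is linearized around mu_y (illustrative choice, not taken from the paper).
    """
    mu_y = np.asarray(mu_y, dtype=float)
    cov_y = np.asarray(cov_y, dtype=float)
    # Central finite-difference gradient of f with respect to y at mu_y
    grad = np.empty_like(mu_y)
    for i in range(mu_y.size):
        step = np.zeros_like(mu_y)
        step[i] = eps
        grad[i] = (f(x, mu_y + step) - f(x, mu_y - step)) / (2.0 * eps)
    mean = f(x, mu_y)            # linearized mean: f evaluated at the mean of y
    var = grad @ cov_y @ grad    # linearized variance: J * Sigma * J^T
    return mean, var


def sampled_moments(f, x, mu_y, cov_y, n=20000, seed=0):
    """Monte Carlo estimate of the same moments, for comparison."""
    rng = np.random.default_rng(seed)
    ys = rng.multivariate_normal(mu_y, cov_y, size=n)
    vals = np.array([f(x, y) for y in ys])
    return vals.mean(), vals.var()


def toy_f(x, y):
    """Hypothetical nonlinear performance function of x and a two-dimensional y."""
    return x * y[0] + y[1] ** 2


# Hypothetical GP posterior for y at a single candidate input x
x = 0.5
mu_y = np.array([1.0, 2.0])
cov_y = np.diag([0.01, 0.04])

lin_mean, lin_var = linearized_moments(toy_f, x, mu_y, cov_y)
mc_mean, mc_var = sampled_moments(toy_f, x, mu_y, cov_y)
print("linearized:", lin_mean, lin_var)
print("sampled:   ", mc_mean, mc_var)
```

The linearized estimate needs only a few evaluations of $f$ per candidate point, whereas the sampling estimate needs one evaluation per draw. The first-order mean omits the curvature contribution, so the two estimates agree closely only when the uncertainty in $y$ is small relative to the nonlinearity of $f$.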
