Black-box zeroth-order optimization is a central primitive for applications in fields as diverse as finance, physics, and engineering. In a common formulation of this problem, a designer sequentially attempts candidate solutions, receiving noisy feedback from the system on the value of each attempt. In this paper, we study scenarios in which feedback is also provided on the safety of the attempted solution, and the optimizer is constrained to limit the number of unsafe solutions that are tried throughout the optimization process. Focusing on methods based on Bayesian optimization (BO), prior art has introduced an optimization scheme, referred to as SAFEOPT, that is guaranteed, with a controllable probability over feedback noise, not to select any unsafe solution, as long as strict assumptions on the safety constraint function are met. We introduce a novel BO-based approach that satisfies safety requirements irrespective of the properties of the constraint function. This strong theoretical guarantee is obtained at the cost of allowing for an arbitrary, controllable but non-zero, rate of violation of the safety constraint. The proposed method, referred to as SAFE-BOCP, builds on online conformal prediction (CP) and is specialized to the cases in which feedback on the safety constraint is either noiseless or noisy. Experimental results on synthetic and real-world data validate the advantages and flexibility of the proposed SAFE-BOCP.