BEACON: A Bayesian Optimization Strategy for Novelty Search in Expensive Black-Box Systems

Novelty search (NS) refers to a class of exploration algorithms that automatically uncover diverse system behaviors through simulations or experiments. Systematically obtaining diverse outcomes is a key component of many real-world design problems, such as material and drug discovery, neural architecture search, reinforcement learning, and robot navigation. Since the relationship between the inputs and outputs (i.e., behaviors) of these complex systems is typically not available in closed form, NS requires a black-box perspective. Consequently, popular NS algorithms rely on evolutionary optimization and other meta-heuristics that require intensive sampling of the input space, which is impractical when the system is expensive to evaluate. We propose a Bayesian-optimization-inspired algorithm for sample-efficient NS that is specifically designed for such expensive black-box systems. Our approach models the input-to-behavior mapping with multi-output Gaussian processes (MOGPs) and selects the next point to evaluate by maximizing a novelty metric computed from a posterior sample drawn from the MOGP, a choice that promotes both exploration and exploitation. We discuss how, by leveraging advances in efficient posterior sampling and high-dimensional Gaussian process modeling, our approach can be made scalable with respect to both the amount of data and the number of inputs. We test our approach on ten synthetic benchmark problems and eight real-world problems (with up to 2133 inputs), including new applications such as the discovery of diverse metal-organic frameworks for use in clean energy technology. We show that our approach greatly outperforms existing NS algorithms by finding substantially larger sets of diverse behaviors under limited sample budgets.
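
The selection loop described in the abstract (fit a surrogate, draw one posterior sample, evaluate the input whose sampled behavior is most novel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: the outputs are modeled by independent GPs with a fixed RBF kernel rather than a full MOGP, the novelty metric is taken to be the mean distance to the k nearest observed behaviors, the search runs over a finite candidate set, and `toy_system` is a hypothetical stand-in for an expensive black box.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.3, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sample_posterior(X_train, Y_train, X_cand, rng, noise=1e-4):
    """Draw one joint posterior sample of every output at the candidates.

    Assumption: independent GPs per output with shared, fixed
    hyperparameters (a simplification of a multi-output GP).
    """
    y_mean = Y_train.mean(axis=0)
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_cand)
    K_ss = rbf_kernel(X_cand, X_cand)
    mean = y_mean + K_s.T @ np.linalg.solve(K, Y_train - y_mean)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    cov = 0.5 * (cov + cov.T)  # symmetrize against round-off
    # The eigendecomposition tolerates a numerically singular covariance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    z = rng.standard_normal((len(X_cand), Y_train.shape[1]))
    return mean + root @ z

def novelty(candidate_behaviors, observed_behaviors, k=3):
    """Assumed metric: mean distance to the k nearest observed behaviors."""
    diffs = candidate_behaviors[:, None, :] - observed_behaviors[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    k = min(k, observed_behaviors.shape[0])
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)

def novelty_search(system, X_cand, n_init=5, n_iter=15, seed=0):
    """Sequentially evaluate the candidate whose sampled behavior is most novel."""
    rng = np.random.default_rng(seed)
    init_idx = rng.choice(len(X_cand), size=n_init, replace=False)
    X = X_cand[init_idx]
    Y = np.array([system(x) for x in X])
    for _ in range(n_iter):
        sampled = sample_posterior(X, Y, X_cand, rng)
        x_next = X_cand[np.argmax(novelty(sampled, Y))]
        X = np.vstack([X, x_next])
        Y = np.vstack([Y, system(x_next)])
    return X, Y

def toy_system(x):
    """Hypothetical two-input, two-behavior system used only for the demo."""
    return np.array([np.sin(3.0 * x[0]) + x[1], np.cos(2.0 * x[1]) * x[0]])

candidates = np.random.default_rng(1).uniform(0.0, 1.0, size=(200, 2))
X_eval, Y_eval = novelty_search(toy_system, candidates)
print(X_eval.shape, Y_eval.shape)
```

Because the novelty score is computed on a random posterior sample rather than on the posterior mean, regions with high predictive uncertainty occasionally produce far-away sampled behaviors and get selected, which is one way to read the exploration and exploitation trade-off mentioned above. A practical version would fit the kernel hyperparameters to the data and optimize over a continuous input space.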
