Robotics and automation can greatly accelerate the solution of intractable, multivariate scientific problems such as materials discovery, but the available search spaces can be dauntingly large. Bayesian optimization (BO) has emerged as a popular sample-efficient optimization engine, thriving in tasks where no analytic form of the target function or property is known. Here we exploit expert human knowledge, expressed as hypotheses, to direct Bayesian searches more quickly to promising regions of chemical space. Previous methods have used underlying distributions derived from existing experimental measurements, which is infeasible for new, unexplored scientific tasks. Moreover, such distributions cannot capture intricate hypotheses. Our proposed method, which we call HypBO, uses expert human hypotheses to generate an improved seed of samples. Unpromising seeds are automatically discounted, while promising seeds are used to augment the surrogate model data, thus achieving better-informed sampling. This process continues in a global-versus-local search fashion, organized as a bilevel optimization framework. We validate the performance of our method on a range of synthetic functions and demonstrate its practical utility on a real chemical design task, where the use of expert hypotheses significantly accelerates the search.
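The seed-screening idea described above can be illustrated with a minimal sketch. This is not the HypBO implementation: it assumes a minimization problem, represents each hypothesis as an axis-aligned box, draws seeds on a deterministic midpoint grid, and calls a hypothesis "promising" when its best seed beats the best seed drawn from the whole search space. The names `grid_seeds`, `screen_hypotheses`, and `sphere`, and the screening rule itself, are hypothetical choices made for illustration. The surrogate model and the bilevel global/local loop are omitted.

```python
from itertools import product


def grid_seeds(bounds, k):
    """Return k**d cell-midpoint samples from an axis-aligned box [(lo, hi), ...]."""
    axes = [[lo + (hi - lo) * (i + 0.5) / k for i in range(k)] for lo, hi in bounds]
    return [list(p) for p in product(*axes)]


def screen_hypotheses(objective, global_bounds, hypotheses, k=4):
    """Keep hypotheses whose best seed beats the best global seed (assumed rule).

    Returns the names of the kept hypotheses and the augmented list of
    (x, y) observations that a surrogate model could then be fitted to.
    """
    # Baseline seeds drawn from the whole search space.
    base = [(x, objective(x)) for x in grid_seeds(global_bounds, k)]
    baseline = min(y for _, y in base)

    data, kept = list(base), []
    for name, box in hypotheses.items():
        seeds = [(x, objective(x)) for x in grid_seeds(box, k)]
        if min(y for _, y in seeds) < baseline:
            # Promising hypothesis: its seeds augment the surrogate data.
            kept.append(name)
            data.extend(seeds)
        # Unpromising hypothesis: its seeds are discarded.
    return kept, data


def sphere(x):
    """Toy objective with its minimum at (0.8, 0.8)."""
    return sum((xi - 0.8) ** 2 for xi in x)


hypotheses = {
    "good": [(0.7, 0.9), (0.7, 0.9)],  # box containing the optimum
    "bad": [(0.0, 0.2), (0.0, 0.2)],   # box far from the optimum
}
kept, data = screen_hypotheses(sphere, [(0.0, 1.0), (0.0, 1.0)], hypotheses)
print(kept, len(data))  # ['good'] 32
```

In this toy run the "bad" hypothesis is discarded because none of its seeds improves on the global baseline, while the 16 seeds from the "good" hypothesis are added to the 16 baseline observations. A full method would refit a surrogate on `data` and repeat the comparison as new samples arrive.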