An autonomous experimentation platform in manufacturing is expected to conduct a sequential search for suitable manufacturing conditions on its own, or even to discover new materials, with minimal human intervention. The core of the intelligent control of such a platform is a policy that decides where to conduct the next experiment based on what has been done thus far. Such a policy inevitably trades off exploitation against exploration. Currently, the prevailing approach is to use various acquisition functions within the Bayesian optimization framework. We discuss whether it is beneficial to trade off exploitation against exploration by measuring the element and degree of surprise associated with the immediate past observation. We devise a surprise-reacting policy using two existing surprise metrics, the Shannon surprise and the Bayesian surprise. Our analysis shows that the surprise-reacting policy appears to be better suited for quickly characterizing the overall landscape of a response surface under resource constraints. We do not claim to have a fully autonomous experimentation system, but we believe that the surprise-reacting capability benefits the automation of sequential decisions in autonomous experimentation.
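To make the two surprise metrics named above concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the abstract: a scalar response with a Gaussian prior on its mean and a known Gaussian noise variance. Under those assumptions, the Shannon surprise is the negative log predictive density of the new observation, and the Bayesian surprise is the Kullback-Leibler divergence from the prior to the posterior. All function names, parameters, and the threshold are hypothetical and chosen for illustration; how a flagged surprise should shift the balance between exploitation and exploration is not specified in the abstract and is deliberately left out.

```python
import math


def posterior(y, prior_mean, prior_var, noise_var):
    """Conjugate Gaussian update of the mean after one observation y.

    Assumption (illustrative only): y ~ N(mu, noise_var), mu ~ N(prior_mean, prior_var).
    """
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + y / noise_var)
    return post_mean, post_var


def shannon_surprise(y, prior_mean, prior_var, noise_var):
    """Negative log predictive density of y, with predictive N(prior_mean, prior_var + noise_var)."""
    pred_var = prior_var + noise_var
    return 0.5 * math.log(2.0 * math.pi * pred_var) + (y - prior_mean) ** 2 / (2.0 * pred_var)


def bayesian_surprise(y, prior_mean, prior_var, noise_var):
    """KL divergence KL(posterior || prior) between the two Gaussian beliefs about the mean."""
    post_mean, post_var = posterior(y, prior_mean, prior_var, noise_var)
    return 0.5 * (
        math.log(prior_var / post_var)
        + (post_var + (post_mean - prior_mean) ** 2) / prior_var
        - 1.0
    )


def is_surprising(surprise, threshold):
    """Hypothetical flag: True when the surprise exceeds a user-chosen threshold."""
    return surprise > threshold


if __name__ == "__main__":
    # Compare an observation near the prior mean with one far from it.
    for y in (0.0, 3.0):
        s = shannon_surprise(y, prior_mean=0.0, prior_var=1.0, noise_var=1.0)
        b = bayesian_surprise(y, prior_mean=0.0, prior_var=1.0, noise_var=1.0)
        print(f"y={y}: Shannon={s:.4f}, Bayesian={b:.4f}, flagged={is_surprising(b, 1.0)}")
```

In this toy setting both metrics grow with the distance between the observation and the prior mean, but they measure different things: the Shannon surprise scores how improbable the observation was, while the Bayesian surprise scores how much the belief changed because of it. A response-surface model such as a Gaussian process would replace the scalar prior in practice; that extension is not shown here.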