Active learning optimizes the exploration of large parameter spaces by strategically selecting which experiments or simulations to conduct, thus reducing resource consumption and potentially accelerating scientific discovery. A key component of this approach is a probabilistic surrogate model, typically a Gaussian Process (GP), which approximates an unknown functional relationship between control parameters and a target property. However, conventional GPs often struggle when applied to systems with discontinuities and non-stationarities, prompting the exploration of alternative models. This limitation is particularly relevant in physical science problems, which are often characterized by abrupt transitions between different system states and rapid changes in the behavior of physical properties. Fully Bayesian Neural Networks (FBNNs) are a promising substitute: they treat all neural network weights probabilistically and use advanced Markov Chain Monte Carlo techniques to sample directly from the posterior distribution. This approach enables FBNNs to provide reliable predictive distributions, which are crucial for making informed decisions under uncertainty in the active learning setting. Although FBNNs have traditionally been considered too computationally expensive for 'big data' applications, many physical science problems involve small amounts of data in relatively low-dimensional parameter spaces. Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime, highlighting their potential to enhance predictive accuracy and reliability on test functions relevant to problems in the physical sciences.
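To make the role of the predictive distribution concrete, the following is a minimal sketch of one active learning selection step. It assumes a maximum-predictive-variance acquisition rule, which is a common choice but is not specified in the text above. The names `predictive_moments` and `select_next` and the toy numbers are hypothetical; in an FBNN setting, each row of `samples` would be the network output for one posterior draw of the weights (for example, one drawn by the No-U-Turn Sampler), evaluated at every candidate point.

```python
import statistics


def predictive_moments(samples):
    """Return per-candidate predictive mean and variance.

    samples[s][i] is the prediction of posterior draw s at candidate i.
    """
    n_candidates = len(samples[0])
    means, variances = [], []
    for i in range(n_candidates):
        column = [draw[i] for draw in samples]
        means.append(statistics.fmean(column))
        variances.append(statistics.pvariance(column))
    return means, variances


def select_next(candidates, samples):
    """Pick the candidate where the posterior draws disagree the most.

    Assumption: pure uncertainty-based exploration (maximum variance).
    """
    _, variances = predictive_moments(samples)
    best = max(range(len(candidates)), key=variances.__getitem__)
    return candidates[best]


# Hypothetical toy data: three candidate inputs, three posterior draws.
candidates = [0.0, 0.5, 1.0]
samples = [
    [1.0, 0.2, 2.0],
    [1.1, 0.9, 2.0],
    [0.9, -0.4, 2.1],
]

# The draws disagree most at the middle candidate, so it is measured next.
next_x = select_next(candidates, samples)
```

In a full loop, the selected point would be measured, added to the training set, and the posterior re-sampled before the next selection.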