We consider the problem of active learning in the context of spatial sampling for level set estimation (LSE), where the goal is to localize, as quickly as possible, all regions where a function of interest lies above or below a given threshold. We present a finite-horizon search procedure that performs LSE in one dimension while optimally balancing the final estimation error against the distance traveled for a fixed number of samples. A tuning parameter controls the trade-off between estimation accuracy and distance traveled. We show that the resulting optimization problem can be solved in closed form and that the resulting policy generalizes existing approaches to this problem. We then show how this approach can be used to perform level set estimation in higher dimensions under the popular Gaussian process model. Empirical results on synthetic data indicate that as the cost of travel increases, our method's ability to treat distance nonmyopically allows it to significantly improve on the state of the art. On real air quality data, our approach achieves roughly one fifth the estimation error at less than half the cost of competing algorithms.
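As a rough illustration of the error-versus-travel trade-off described above, the following toy sketch searches for a single threshold crossing on the unit interval. It is not the closed-form policy from this work. It assumes a noiseless step function, and the names `fractional_search`, `penalized_cost`, `rho`, and `lam` are hypothetical, introduced only for this example. Setting `rho = 0.5` gives plain bisection; other values place each sample at a different fraction of the current uncertainty interval.

```python
def fractional_search(change_point, n_samples, rho=0.5, start=0.0):
    """Toy 1-D level set search (illustrative assumption, not the paper's policy).

    The function of interest is assumed to be a noiseless step: below the
    threshold for x < change_point and at or above it for x >= change_point,
    with change_point in (0, 1]. The interval [lo, hi] always brackets the
    crossing. Each sample is taken at a fraction rho of that interval.
    """
    lo, hi = 0.0, 1.0
    pos = start          # current location of the sampler
    traveled = 0.0       # total distance moved so far
    for _ in range(n_samples):
        x = lo + rho * (hi - lo)      # next sampling location
        traveled += abs(x - pos)      # pay for the move
        pos = x
        if x >= change_point:         # observed "above threshold"
            hi = x
        else:                         # observed "below threshold"
            lo = x
    return lo, hi, traveled


def penalized_cost(lo, hi, traveled, lam):
    """Hypothetical objective: final interval width plus lam times distance."""
    return (hi - lo) + lam * traveled


if __name__ == "__main__":
    # Compare a few sampling fractions under one travel-cost weight.
    for rho in (0.5, 0.3, 0.1):
        lo, hi, dist = fractional_search(0.3, n_samples=10, rho=rho)
        cost = penalized_cost(lo, hi, dist, lam=0.1)
        print(f"rho={rho}: width={hi - lo:.4f}, distance={dist:.4f}, cost={cost:.4f}")
```

Here the final interval width stands in for the estimation error and `lam` plays the role of the tuning parameter. How the best `rho` shifts with `lam` depends on the crossing location and the sample budget, so the script only prints the quantities for comparison.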