Active learning (AL) is a sequential learning scheme that selects the most informative data points to label, reducing data consumption and avoiding the cost of labeling large datasets. However, conventional AL retrains the model and solves an acquisition optimization problem for every selection, which becomes expensive when either step is difficult. In this paper, we focus on active learning of nonparametric functions, where the gold-standard Gaussian process (GP) approaches suffer from cubic time complexity in the number of data points. We propose an amortized AL method in which new data are suggested by a neural network that is trained up front without any real data (Figure 1). Our method avoids repeated model training and requires no acquisition optimization during AL deployment. We (i) use GPs as function priors to construct an AL simulator, (ii) train an AL policy that generalizes zero-shot from simulation to real learning problems on nonparametric functions, and (iii) achieve real-time data selection with learning performance comparable to that of time-consuming baseline methods.
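To make the per-step cost concrete, the following is a minimal sketch of a conventional GP-based AL loop of the kind the amortized method is meant to replace. It is not the paper's implementation. The RBF kernel with a fixed lengthscale, the maximum-variance acquisition over a finite candidate grid, the noise level, and all function names (`rbf_kernel`, `sample_prior_function`, `gp_posterior`, `run_conventional_al`) are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Unit-variance squared-exponential kernel between 1-D input arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_prior_function(grid, rng, jitter=1e-6):
    """Draw one function from the GP prior, evaluated on `grid`.
    In a simulator, such a draw stands in for an unknown target function."""
    K = rbf_kernel(grid, grid) + jitter * np.eye(len(grid))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(grid))

def gp_posterior(x_train, y_train, grid, noise=1e-2):
    """Exact GP posterior mean and variance on `grid`.
    The Cholesky factorization costs O(n^3) in the number of training points."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    K_star = rbf_kernel(x_train, grid)
    v = np.linalg.solve(L, K_star)            # L^{-1} K_star
    mean = v.T @ np.linalg.solve(L, y_train)  # K_star^T K^{-1} y
    var = 1.0 - np.sum(v ** 2, axis=0)        # prior variance is 1
    return mean, var

def run_conventional_al(grid, f_true, n_queries=8, noise=1e-2, seed=0):
    """Pool-based AL: refit the GP and maximize the acquisition at every step."""
    rng = np.random.default_rng(seed)
    # Noisy labels that an oracle would return for each candidate.
    y_all = f_true + np.sqrt(noise) * rng.standard_normal(len(grid))
    chosen = [int(rng.integers(len(grid)))]   # one random initial point
    max_vars = []
    for _ in range(n_queries):
        # "Model training": recompute the posterior from scratch.
        # Hyperparameters are fixed here, so the variance ignores the labels;
        # a real pipeline would typically re-estimate them at each step too.
        _, var = gp_posterior(grid[chosen], y_all[chosen], grid, noise)
        max_vars.append(float(var.max()))
        # "Acquisition optimization": argmax of variance over unlabeled candidates.
        var[chosen] = -np.inf
        chosen.append(int(np.argmax(var)))
    mean, _ = gp_posterior(grid[chosen], y_all[chosen], grid, noise)
    rmse = float(np.sqrt(np.mean((mean - f_true) ** 2)))
    return chosen, max_vars, rmse

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 50)
    f_true = sample_prior_function(grid, np.random.default_rng(0))
    chosen, max_vars, rmse = run_conventional_al(grid, f_true)
    print("queried indices:", chosen)
    print("RMSE on grid:", rmse)
```

In the amortized setting, the refit and the argmax inside the loop would be replaced by a single forward pass of a pre-trained policy network. Prior draws such as those from `sample_prior_function` illustrate how a simulator can supply training functions without any real data.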