Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate the observations that are likely to yield the largest gain in model performance. We propose an active learning algorithm that selects not only which observation to annotate but also the precision of the annotation that is acquired. Assuming that low-precision annotations are cheaper to obtain, this allows the model to explore a larger part of the input space with the same annotation budget. We build our acquisition function on the previously proposed BALD objective for Gaussian processes and empirically demonstrate the gains from adjusting the annotation precision within the active learning loop.
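To make the idea of jointly choosing an observation and an annotation precision concrete, the following is a minimal sketch and not the acquisition function proposed here. It assumes a Gaussian process regression model with Gaussian annotation noise, for which the mutual information between a noisy annotation and the latent function value at an input has the closed form 0.5 * log(1 + s^2 / sigma^2), where s^2 is the posterior variance of the latent function and sigma^2 is the annotation noise variance. Dividing this gain by an annotation cost is an illustrative assumption, as are the RBF kernel, the discrete set of noise levels, and all function and parameter names.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def latent_posterior_variance(x_train, noise_var_train, x_cand):
    """Posterior variance of the latent GP at x_cand.

    Each training annotation carries its own noise variance, so low- and
    high-precision annotations can be mixed in the same training set.
    """
    prior_var = np.diag(rbf_kernel(x_cand, x_cand)).copy()
    if len(x_train) == 0:
        return prior_var
    K = rbf_kernel(x_train, x_train) + np.diag(noise_var_train)
    Ks = rbf_kernel(x_train, x_cand)
    v = np.linalg.solve(K, Ks)
    return prior_var - np.sum(Ks * v, axis=0)


def select_query(x_train, noise_var_train, x_cand, noise_levels, costs):
    """Pick the (candidate, noise level) pair with the best gain per unit cost.

    The gain is the mutual information between the noisy annotation and the
    latent function value; the cost weighting is an assumption of this sketch.
    """
    s2 = latent_posterior_variance(x_train, noise_var_train, x_cand)
    # gain[i, j]: information from annotating candidate i at noise level j
    gain = 0.5 * np.log1p(s2[:, None] / noise_levels[None, :])
    score = gain / costs[None, :]
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(i), int(j), score


# One precise annotation already exists at x = 0.
x_train = np.array([0.0])
noise_var_train = np.array([0.01])
x_cand = np.array([0.0, 3.0])
noise_levels = np.array([0.01, 1.0])  # precise vs. imprecise annotation
costs = np.array([1.0, 0.1])          # hypothetical cost of each precision
i, j, score = select_query(x_train, noise_var_train, x_cand, noise_levels, costs)
print(f"query x = {x_cand[i]}, noise variance = {noise_levels[j]}")
```

In this toy setting the already-annotated input offers almost no information at either precision, so the selection moves to the unexplored input. Which precision is chosen there depends entirely on the assumed cost ratio.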