Parallelized active learning is widely used, but it typically relies on a batch size that is fixed throughout experimentation. A fixed batch size is inefficient because cost and speed trade off dynamically: larger batches are more costly, smaller batches lead to slower wall-clock run-times, and the balance may change over the run (larger batches are often preferable earlier). To address this trade-off, we propose a novel Probabilistic Numerics framework that adaptively changes batch sizes. By framing batch selection as a quadrature task, our integration-error-aware algorithm automatically tunes batch sizes to meet predefined quadrature precision objectives, akin to how typical optimizers terminate based on convergence thresholds. This approach removes the need for exhaustive searches across all potential batch sizes. We also extend the framework to constrained active learning and constrained optimization, interpreting constraint violations as reductions in the precision requirement, which in turn adapt batch construction. Through extensive experiments, we demonstrate that our approach significantly enhances learning efficiency and flexibility in diverse Bayesian batch active learning and Bayesian optimization applications.
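To make the idea of tuning the batch size to a quadrature precision target concrete, here is a minimal toy sketch. It is not the algorithm from the paper. It assumes a finite candidate pool, an RBF kernel with a fixed lengthscale, and a uniform target measure over the pool, and it uses the standard kernel-quadrature posterior variance as the integration-error estimate. The names `rbf_kernel`, `adaptive_batch`, `tol`, and `max_batch` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two sets of points (rows)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


def adaptive_batch(candidates, tol, max_batch=50, jitter=1e-8):
    """Greedily grow a batch until the quadrature variance drops below tol.

    Assumption: the integration measure is uniform over the candidate pool,
    so the kernel mean and the prior integral variance are plain averages
    of the kernel matrix.
    """
    K = rbf_kernel(candidates, candidates)
    z = K.mean(axis=1)      # kernel mean evaluated at each candidate
    prior_var = K.mean()    # integral variance before selecting any point
    selected, var = [], prior_var

    while var > tol and len(selected) < max_batch:
        best_i, best_var = None, var
        for i in range(len(candidates)):
            if i in selected:
                continue
            idx = selected + [i]
            K_ss = K[np.ix_(idx, idx)] + jitter * np.eye(len(idx))
            # Posterior variance of the integral given the points in idx.
            v = prior_var - z[idx] @ np.linalg.solve(K_ss, z[idx])
            if v < best_var:
                best_i, best_var = i, v
        if best_i is None:  # no candidate reduces the error estimate further
            break
        selected.append(best_i)
        var = best_var

    return selected, var
```

In this sketch the batch size is an output rather than an input: a looser `tol` stops the greedy loop earlier and yields a smaller batch, while a tighter `tol` yields a larger one. For example, `adaptive_batch(pool, tol=1e-2)` and `adaptive_batch(pool, tol=1e-4)` on the same `pool` array of shape `(n, d)` return nested batches of different lengths.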