Active learning can make training prediction models more efficient by identifying the most informative new labels to acquire. However, non-response to label requests can reduce active learning's effectiveness in real-world contexts. We conceptualise this degradation by considering the type of non-response present in the data, and demonstrate that biased non-response is particularly detrimental to model performance. We argue that biased non-response is especially likely in contexts where the labelling process inherently relies on user interactions. To mitigate its impact, we propose a cost-based correction to the sampling strategy, the Upper Confidence Bound of the Expected Utility (UCB-EU), that can plausibly be applied to any active learning algorithm. Through experiments, we demonstrate that our method reduces the harm from labelling non-response in many settings. However, we also characterise settings where non-response bias in the annotations remains detrimental under UCB-EU for particular sampling methods and data-generating processes. Finally, we evaluate our method on a real-world dataset from the e-commerce platform Taobao, showing that UCB-EU yields substantial performance improvements in conversion models trained on clicked impressions. More generally, this research serves both to better conceptualise the interplay between types of non-response and model improvements via active learning, and to provide a practical, easy-to-implement correction that helps mitigate model degradation.