Despite recent advances, NLP models remain vulnerable to bias. This bias often originates from the uneven distribution of real-world data and can propagate through the annotation process. The growing integration of these models into our lives calls for methods that mitigate bias without excessive annotation costs. While active learning (AL) has shown promise in training models with a small amount of annotated data, AL's reliance on the model's behavior for selective sampling can lead to an accumulation of unwanted bias rather than bias mitigation. However, combining clustering with AL can overcome the bias issues of both AL and traditional annotation methods while retaining AL's annotation efficiency. In this paper, we propose a novel adaptive clustering-based active learning algorithm, D-CALM, that dynamically adjusts clustering and annotation efforts in response to an estimated classifier error rate. Experiments on eight datasets covering a diverse set of text classification tasks, including emotion, hate speech, dialog act, and book type detection, demonstrate that our proposed algorithm significantly outperforms baseline AL approaches with both pretrained transformers and traditional Support Vector Machines. D-CALM is robust to different measures of information gain and, as evident from our analysis of label and error distributions, can significantly reduce unwanted model bias.
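The abstract does not spell out how annotation effort is redistributed, so the following is only an illustrative sketch of the general idea of steering effort toward clusters with a higher estimated classifier error rate. It is not the D-CALM procedure itself. The function name `allocate_budget`, its parameters, and the proportional rule with largest-remainder rounding are all assumptions made for this example.

```python
from typing import List


def allocate_budget(error_rates: List[float], budget: int) -> List[int]:
    """Split `budget` annotations across clusters in proportion to each
    cluster's estimated error rate (hypothetical rule, not from the paper)."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    if any(e < 0 for e in error_rates):
        raise ValueError("error rates must be non-negative")
    if not error_rates:
        return []

    total = sum(error_rates)
    if total == 0:
        # No estimated errors anywhere: fall back to a uniform split.
        shares = [budget / len(error_rates)] * len(error_rates)
    else:
        shares = [budget * e / total for e in error_rates]

    # Floor each share (shares are non-negative, so int() floors).
    counts = [int(s) for s in shares]
    leftover = budget - sum(counts)

    # Give the remaining annotations to the largest fractional parts.
    order = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Three clusters; the first has twice the estimated error of the others.
    print(allocate_budget([0.5, 0.25, 0.25], 100))
```

In an adaptive loop, a routine like this would be called once per annotation round, after re-estimating per-cluster error on the newly labeled samples, so that later rounds spend more of the budget where the classifier appears weakest.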