Human data labeling is an important and expensive task at the heart of supervised learning systems. Hierarchies help humans understand and organize concepts. We ask whether and how concept hierarchies can inform the design of annotation interfaces to improve labeling quality and efficiency. We study this question through the annotation of vaccine misinformation, where the labeling task is difficult and highly subjective. We investigate six user interface designs for crowdsourcing hierarchical labels by collecting over 18,000 individual annotations. Under a fixed budget, integrating hierarchies into the design improves crowd workers' F1 scores. We attribute this to (1) grouping similar concepts, which improves F1 scores by 0.16 over random groupings; (2) strong relative performance on high-difficulty examples (relative F1 score difference of +0.40); and (3) filtering out obvious negatives, which increases precision by 0.07. Ultimately, labeling schemes that integrate the hierarchy outperform those that do not, achieving a mean F1 of 0.70.