Human data labeling is an important and expensive task at the heart of supervised learning systems. Hierarchies help humans understand and organize concepts. We ask whether and how concept hierarchies can inform the design of annotation interfaces to improve labeling quality and efficiency. We study this question through the annotation of vaccine misinformation, where the labeling task is difficult and highly subjective. We investigate 6 user interface designs for crowdsourcing hierarchical labels, collecting over 18,000 individual annotations. Under a fixed budget, integrating hierarchies into the design improves crowd workers' F1 scores. We attribute this to (1) grouping similar concepts, which improves F1 scores by +0.16 over random groupings; (2) strong relative performance on high-difficulty examples (relative F1 score difference of +0.40); and (3) filtering out obvious negatives, which increases precision by +0.07. Ultimately, labeling schemes that integrate the hierarchy outperform those that do not, achieving a mean F1 of 0.70.
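For readers less familiar with the metrics reported above, the following is a minimal sketch of how precision, recall, and F1 can be computed for one worker's label set against a reference label set. The function name, the label names, and the per-passage set-based scoring are illustrative assumptions; the abstract does not specify how scores were aggregated across workers or examples.

```python
def precision_recall_f1(predicted, gold):
    """Score one set of predicted labels against a reference set.

    Illustrative helper (not from the paper): both arguments are
    iterables of label names for a single annotated passage.
    """
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)  # labels the worker got right
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for a single passage (not taken from the study).
worker_labels = {"safety", "conspiracy", "ingredients"}
gold_labels = {"safety", "ingredients", "efficacy", "side_effects"}
p, r, f1 = precision_recall_f1(worker_labels, gold_labels)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Under this definition, removing a wrong label from the worker's set (for example, by filtering out obvious negatives) raises precision without changing recall.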