Solving complex classification tasks with deep neural networks typically requires large amounts of annotated data. However, the corresponding class labels are noisy when they are provided by error-prone annotators, e.g., crowd workers. Training standard deep neural networks on such labels leads to subpar performance in these multi-annotator supervised learning settings. We address this issue by presenting a probabilistic training framework named multi-annotator deep learning (MaDL). A ground truth model and an annotator performance model are jointly trained in an end-to-end learning approach. The ground truth model learns to predict instances' true class labels, while the annotator performance model infers probabilistic estimates of annotators' performance. A modular network architecture enables us to make varying assumptions regarding annotators' performance, e.g., an optional class or instance dependency. Further, we learn annotator embeddings to estimate annotators' densities within a latent space as proxies of their potentially correlated annotations. Combined with a weighted loss function, these density estimates improve learning from correlated annotation patterns. In a comprehensive evaluation, we examine three research questions about multi-annotator supervised learning. Our findings indicate that MaDL achieves state-of-the-art performance and is robust against many correlated, spamming annotators.
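To make the interplay of the two models concrete, the following minimal sketch shows one common way to combine a ground truth model's class probabilities with a per-annotator confusion matrix and score the observed annotations with a weighted negative log-likelihood. It is an illustration under stated assumptions, not MaDL's actual implementation: the function names (`annotation_probs`, `weighted_nll`), the fixed confusion matrices, and the hand-set annotator weights are all hypothetical. In the framework described above, the confusion estimates would come from the trained annotator performance model and the weights from the annotator densities in the embedding space.

```python
import math


def softmax(logits):
    """Turn raw scores into class probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def annotation_probs(class_probs, confusion):
    """Marginalize over the unknown true class y.

    p(z | x, a) = sum_y p(y | x) * p(z | y, a), where confusion[y][z]
    is the (assumed) probability that annotator a reports class z when
    the true class is y.
    """
    n_classes = len(class_probs)
    return [
        sum(class_probs[y] * confusion[y][z] for y in range(n_classes))
        for z in range(n_classes)
    ]


def weighted_nll(class_probs, annotations, confusions, weights):
    """Weighted negative log-likelihood of one instance's annotations.

    annotations: dict annotator -> reported class index
    confusions:  dict annotator -> confusion matrix (rows sum to 1)
    weights:     dict annotator -> non-negative weight (hypothetical
                 stand-in for density-based annotator weights)
    """
    loss = 0.0
    for annotator, label in annotations.items():
        probs = annotation_probs(class_probs, confusions[annotator])
        loss -= weights[annotator] * math.log(probs[label])
    return loss


# Toy example with two classes and two annotators (all values invented).
class_probs = softmax([2.0, 0.0])            # ground truth model output
confusions = {
    "reliable": [[0.9, 0.1], [0.1, 0.9]],    # mostly correct annotator
    "spammer": [[0.5, 0.5], [0.5, 0.5]],     # labels carry no information
}
annotations = {"reliable": 0, "spammer": 1}
weights = {"reliable": 1.0, "spammer": 0.5}  # down-weight the spammer

loss = weighted_nll(class_probs, annotations, confusions, weights)
```

In this sketch, a spamming annotator with a uniform confusion matrix yields the same annotation distribution regardless of the predicted class, so its labels give the ground truth model no signal; lowering its weight further reduces its influence on the loss.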