Crowdsourcing provides a flexible approach for leveraging human intelligence to solve large-scale problems and has gained widespread acceptance in domains such as intelligent information processing, social decision-making, and crowd ideation. However, the uncertainty of participants significantly compromises answer quality, sparking substantial research interest. Existing surveys predominantly concentrate on quality control in Boolean tasks, which are generally formulated as simple label classification, ranking, or numerical prediction. Ubiquitous open-ended tasks such as question answering, translation, and semantic segmentation have not been sufficiently discussed. These tasks usually have large or even infinite answer spaces and non-unique acceptable answers, posing significant challenges for quality assurance. This survey focuses on quality control methods applicable to open-ended tasks in crowdsourcing. We propose a two-tiered framework to categorize related works. The first tier introduces a holistic view of the quality model, encompassing key aspects including task, worker, answer, and system. The second tier refines the classification into more detailed categories, including quality dimensions, evaluation metrics, and design decisions, providing insight into the internal structure of the quality control framework in each aspect. We thoroughly investigate how these quality control methods are implemented in state-of-the-art works and discuss key challenges and potential future research directions.