The Dawid-Skene model is the most widely assumed model in the analysis of crowdsourcing algorithms that estimate ground-truth labels from noisy worker responses. In this work, we are motivated by crowdsourcing applications where workers have distinct skill sets and their accuracy additionally depends on a task's type. While weighted majority vote (WMV) with a single weight vector for each worker achieves the optimal label estimation error in the Dawid-Skene model, we show that different weights for different types are necessary in a multi-type model. Focusing on the case where there are two types of tasks, we propose a spectral method that partitions the tasks into two groups by type. Our analysis reveals that task types can be perfectly recovered if the number of workers $n$ scales logarithmically with the number of tasks $d$. Any algorithm designed for the Dawid-Skene model can then be applied independently to each type to infer the labels. Numerical experiments show that clustering tasks by type before estimating ground-truth labels improves the performance of crowdsourcing algorithms in practical applications.
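The two-stage pipeline described above (cluster tasks by type, then run a Dawid-Skene-style estimator within each cluster) can be illustrated with a minimal NumPy sketch. This is not the algorithm analyzed in the paper: the simulated worker accuracies, the clustering heuristic (sign of the second leading eigenvector of the entrywise-squared task Gram matrix), the rank-one weight estimate, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def simulate(n=60, d=300, seed=0):
    """Toy two-type model; every parameter value here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    labels = rng.choice([-1, 1], size=d)       # ground-truth binary labels
    types = rng.integers(0, 2, size=d)         # hidden task types
    acc = np.full((n, 2), 0.5)                 # acc[i, k]: accuracy of worker i on type k
    acc[: n // 2, 0] = 0.9                     # first half is skilled on type 0 only
    acc[n // 2 :, 1] = 0.9                     # second half is skilled on type 1 only
    correct = rng.random((n, d)) < acc[:, types]
    X = np.where(correct, labels, -labels)     # n x d matrix of +/-1 responses
    return X, labels, types


def cluster_tasks(X):
    """Heuristic spectral clustering of tasks into two groups (assumed method)."""
    G = (X.T @ X).astype(float)                # task-by-task agreement matrix
    M = G ** 2                                 # squaring removes the unknown label signs
    np.fill_diagonal(M, 0.0)
    _, vecs = np.linalg.eigh(M)                # eigenvalues are returned in ascending order
    return (vecs[:, -2] > 0).astype(int)       # sign of the second leading eigenvector


def weighted_vote(X):
    """Weighted majority vote with weights from the top left singular vector."""
    u, _, _ = np.linalg.svd(X.astype(float), full_matrices=False)
    w = u[:, 0]
    w = w if w.sum() >= 0 else -w              # assumes workers beat random guessing on average
    return np.where(w @ X >= 0, 1, -1)


def infer_labels(X):
    """Cluster tasks first, then estimate labels separately within each cluster."""
    clusters = cluster_tasks(X)
    est = np.empty(X.shape[1], dtype=int)
    for c in (0, 1):
        mask = clusters == c
        if mask.any():
            est[mask] = weighted_vote(X[:, mask])
    return clusters, est


X, labels, types = simulate()
clusters, est = infer_labels(X)
agree = np.mean(clusters == types)
cluster_acc = max(agree, 1 - agree)            # cluster ids are defined only up to a swap
label_acc = np.mean(est == labels)
print(f"type recovery: {cluster_acc:.3f}, label accuracy: {label_acc:.3f}")
```

In this toy setting each worker is informative on only one type, so a single weight per worker cannot be well tuned for both types at once, whereas the per-cluster weights can adapt to each type separately.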