In crowdsourcing, quality control is commonly achieved by having workers examine items and vote on their correctness. To limit the impact of unreliable worker responses, a $\delta$-margin voting process is often used, in which additional votes are solicited until agreement between workers exceeds a predetermined threshold $\delta$. Although widely adopted, the process has so far been used only as a heuristic. We model it as an absorbing Markov chain to analyze the characteristics of the voting process that matter in crowdsourcing. We provide closed-form equations for the quality of the resulting consensus vote, the expected number of votes required for consensus, the variance of the number of votes required, and other moments of its distribution. We show how the threshold $\delta$ can be adjusted to achieve quality equivalence across voting processes that employ workers with different accuracy levels. We also provide efficiency-equalizing payment rates for voting processes with different expected response accuracy levels. Additionally, our model accommodates items of varying difficulty and uncertainty about the difficulty of each item. Simulations using real-world crowdsourced vote data validate the effectiveness of our theoretical model in characterizing the consensus aggregation process. The results can be applied in practical crowdsourcing applications.
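As an illustration of the kind of quantities described above, the following is a minimal sketch and not the paper's own derivation. It assumes a binary vote, a single worker accuracy `p` that is independent across votes, and a process that stops as soon as the vote margin equals `delta`. Under those assumptions the process is a standard gambler's-ruin random walk, whose absorption probability and expected duration are well known. All function names are hypothetical. The sketch also propagates probability mass through the chain numerically, so the closed forms can be checked against the chain itself.

```python
def consensus_quality(p: float, delta: int) -> float:
    """Probability of absorbing at +delta (correct consensus).

    Standard gambler's-ruin result for a walk started at margin 0.
    """
    return p**delta / (p**delta + (1 - p) ** delta)


def expected_votes(p: float, delta: int) -> float:
    """Expected number of votes until the margin hits +delta or -delta."""
    if abs(p - 0.5) < 1e-12:
        return float(delta**2)  # symmetric walk: expected duration is delta^2
    return delta * (2 * consensus_quality(p, delta) - 1) / (2 * p - 1)


def iterate_chain(p: float, delta: int, max_steps: int = 100_000):
    """Numerically propagate probability mass through the absorbing chain.

    Returns (probability of correct consensus, expected number of votes).
    """
    mass = {0: 1.0}  # transient states: margin -> probability
    correct = 0.0
    votes = 0.0
    for step in range(1, max_steps + 1):
        nxt = {}
        for margin, prob in mass.items():
            for new_margin, w in ((margin + 1, p), (margin - 1, 1 - p)):
                q = prob * w
                if new_margin == delta:       # absorbed: correct consensus
                    correct += q
                    votes += step * q
                elif new_margin == -delta:    # absorbed: incorrect consensus
                    votes += step * q
                else:
                    nxt[new_margin] = nxt.get(new_margin, 0.0) + q
        mass = nxt
        if sum(mass.values()) < 1e-15:        # remaining mass is negligible
            break
    return correct, votes


def min_delta_for_quality(p: float, target: float, max_delta: int = 1000) -> int:
    """Smallest delta whose consensus quality reaches `target` (requires p > 0.5)."""
    for delta in range(1, max_delta + 1):
        if consensus_quality(p, delta) >= target:
            return delta
    raise ValueError("target quality not reachable within max_delta")


if __name__ == "__main__":
    for p in (0.6, 0.7, 0.9):
        d = min_delta_for_quality(p, 0.99)
        print(p, d, consensus_quality(p, d), expected_votes(p, d))
```

Under these assumptions, less accurate workers need a larger threshold to reach the same consensus quality, and the expected number of votes at that threshold gives a simple basis for comparing cost per item across worker pools.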