Common crowdsourcing systems produce a group estimate of a latent quantity of interest by averaging the estimates provided by many crowdworkers. We develop a new approach, just-predict-others, that leverages self-supervised learning and a novel aggregation scheme. This approach adapts the weights assigned to crowdworkers based on the estimates they provided for previous quantities. When skills vary across crowdworkers or their estimates correlate, the resulting weighted sum offers a more accurate group estimate than the average. Existing algorithms such as expectation maximization can, at least in principle, produce similarly accurate group estimates. However, their computational requirements become onerous when complex models, such as neural networks, are required to express relationships among crowdworkers. Just-predict-others accommodates such complexity as well as many other practical challenges. We analyze the efficacy of just-predict-others through theoretical and computational studies. Among other things, we establish asymptotic optimality as the number of engagements per crowdworker grows.
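The claim that a weighted sum can beat the plain average when skills vary can be illustrated with a small simulation. The sketch below is not the just-predict-others algorithm; it is a deliberately crude stand-in that assumes independent Gaussian noise, hypothetical noise levels (`noise_sd`), and a simple heuristic in which each crowdworker's weight is inversely proportional to their mean squared disagreement with the mean of the other crowdworkers on previous quantities. The function name `simulate` and all parameters are illustrative assumptions.

```python
import random


def simulate(seed=0, num_history=2000, num_test=2000):
    rng = random.Random(seed)
    # Assumed noise standard deviations: one skilled worker, four noisy ones.
    noise_sd = [0.2, 2.0, 2.0, 2.0, 2.0]
    k = len(noise_sd)

    def draw_estimates():
        # Latent quantity plus independent per-worker noise (an assumption).
        truth = rng.gauss(0.0, 1.0)
        return truth, [truth + rng.gauss(0.0, sd) for sd in noise_sd]

    # Step 1: on previous quantities, measure how far each worker sits from
    # the mean of the other workers. The ground truth is never used here.
    disagreement = [0.0] * k
    for _ in range(num_history):
        _, est = draw_estimates()
        total = sum(est)
        for i in range(k):
            others_mean = (total - est[i]) / (k - 1)
            disagreement[i] += (est[i] - others_mean) ** 2
    disagreement = [d / num_history for d in disagreement]

    # Step 2: heuristic weights, inversely proportional to disagreement,
    # normalized to sum to one.
    raw = [1.0 / d for d in disagreement]
    raw_total = sum(raw)
    weights = [r / raw_total for r in raw]

    # Step 3: compare the plain average with the weighted sum on fresh
    # quantities, using the ground truth only to score both estimates.
    avg_se = 0.0
    weighted_se = 0.0
    for _ in range(num_test):
        truth, est = draw_estimates()
        avg_se += (sum(est) / k - truth) ** 2
        weighted_se += (sum(w * e for w, e in zip(weights, est)) - truth) ** 2
    return weights, avg_se / num_test, weighted_se / num_test


if __name__ == "__main__":
    weights, avg_mse, weighted_mse = simulate()
    print("weights:", [round(w, 3) for w in weights])
    print("plain average MSE:", round(avg_mse, 3))
    print("weighted sum MSE:", round(weighted_mse, 3))
```

Under these assumptions the skilled crowdworker receives the largest weight and the weighted sum has lower mean squared error than the average. The heuristic ignores correlation among crowdworkers, which is one of the cases the abstract says the proposed approach is meant to handle.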