When groups of people are tasked with making a judgment, uncertainty often arises. Existing methods for reducing uncertainty typically focus on iteratively making the overall task instructions more specific. However, uncertainty can arise from multiple sources, such as ambiguity in the item being judged due to limited context, or disagreement among participants due to differing perspectives and an under-specified task. A one-size-fits-all intervention may be ineffective if it does not target the right source of uncertainty. In this paper, we introduce a new workflow, Judgment Sieve, that reduces uncertainty in group judgment tasks in a targeted manner. Using measurements that separate the different sources of uncertainty during an initial round of judgment elicitation, we select a targeted intervention for each item being judged, either adding context or adding deliberation, to most effectively reduce that item's uncertainty. We test our approach on two tasks, rating word pair similarity and rating the toxicity of online comments, and show that targeted interventions reduced uncertainty for the most uncertain cases. In the top 10% most uncertain cases, we saw ambiguity reductions of 21.4% and 25.7% and disagreement reductions of 22.2% and 11.2% for the two tasks, respectively. A simulation further showed that our targeted approach reduced the average uncertainty scores for both sources of uncertainty, whereas under uniform approaches a reduction in average uncertainty from one source came with an increase in the other.