Majority voting (MV) is the prototypical ``wisdom of the crowd'' algorithm. Theorems on when MV is optimal for group decisions date back to Condorcet's 1785 jury \emph{decision} theorem. The same error independence assumption underlying that theorem can be used to prove a jury \emph{evaluation} theorem, which performs purely algebraic evaluation (AE) of juror performance from a batch of their decisions. Three or more binary jurors suffice to obtain the only two possible statistics of their correctness on a test they took. AE is superior to MV in three ways. First, its empirical assumptions are looser: it can handle jurors who are less than 50\% accurate in their decisions. Second, given its assumption of error independence, it evaluates the jurors with point-like precision. This precision enables a multi-accuracy approach that has higher labeling accuracy than MV and comes with empirical uncertainty bounds. Third, it is self-alarming about the failure of its error independence assumption. Experiments using demographic data from the American Community Survey confirm the practical utility of AE over MV. Two implications of the theorem for AI safety are discussed: a principled way to terminate infinite monitoring chains (who grades the graders?) and the super-alignment problem (how do we evaluate agents doing tasks we do not understand?).
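
As a worked illustration of why MV needs jurors better than chance, consider a minimal case that is an assumption for exposition and not taken from the text above: three jurors who err independently and share a common, hypothetical accuracy $p$. The majority is correct when at least two of the three are correct, so
\[
P_{\mathrm{MV}}(p) = p^{3} + 3p^{2}(1-p) = 3p^{2} - 2p^{3},
\qquad
P_{\mathrm{MV}}(p) - p = p\,(1-p)\,(2p-1).
\]
The difference is positive only for $p > 1/2$. For example, $p = 0.6$ gives $P_{\mathrm{MV}} = 0.648$, while $p = 0.4$ gives $P_{\mathrm{MV}} = 0.352$, so under these assumptions MV amplifies the errors of jurors who are worse than chance.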