Preference-based subjective evaluation is a key method for reliably evaluating generative media. However, the large number of pair combinations prevents it from being applied to large-scale evaluation using crowdsourcing. To address this issue, we propose a method that automatically optimizes preference-based subjective evaluation, both in the selection of pair combinations and in the allocation of evaluation volume to each pair, using online learning in a crowdsourcing environment. We use a preference-based online learning method built on a sorting algorithm to identify the total order of the evaluation targets with the minimum number of samples. Our online learning algorithm supports parallel and asynchronous execution under the fixed-budget conditions that crowdsourcing requires. In an experiment on preference-based subjective evaluation of synthetic speech, our method reduced the number of pair combinations from 351 to 83 and allocated optimal evaluation volumes, ranging from 30 to 663 per pair, without compromising evaluation accuracy or wasting budget.
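To illustrate why a sorting-based procedure needs far fewer pairs than an exhaustive test, the following is a minimal sketch and not the authors' algorithm. It assumes 27 evaluation targets, inferred from 351 = 27 × 26 / 2 rather than stated in the abstract. It also substitutes a fixed number of simulated votes per pair with majority voting for the adaptive, fixed-budget allocation that the abstract describes. The names `count_all_pairs`, `sort_with_preferences`, `true_score`, `votes_per_pair`, and `noise` are hypothetical.

```python
import random
from functools import cmp_to_key


def count_all_pairs(n_items: int) -> int:
    """Number of unordered pairs in an exhaustive paired-comparison test."""
    return n_items * (n_items - 1) // 2


def sort_with_preferences(items, true_score, votes_per_pair=31, noise=1.0, seed=0):
    """Rank integer item ids with a comparison sort driven by simulated votes.

    Returns (ranking, pairs_used): the ranking lists the most preferred item
    first, and pairs_used counts the distinct pairs the sort actually queried.
    """
    rng = random.Random(seed)
    cache = {}  # (lo, hi) -> True if the majority of votes preferred lo

    def prefer(a, b):
        key = (min(a, b), max(a, b))
        if key not in cache:
            lo, hi = key
            # Each simulated listener perceives both items with Gaussian noise.
            wins_lo = sum(
                true_score[lo] + rng.gauss(0.0, noise)
                > true_score[hi] + rng.gauss(0.0, noise)
                for _ in range(votes_per_pair)
            )
            cache[key] = 2 * wins_lo > votes_per_pair
        winner = key[0] if cache[key] else key[1]
        return -1 if winner == a else 1  # the preferred item sorts first

    ranking = sorted(items, key=cmp_to_key(prefer))
    return ranking, len(cache)


if __name__ == "__main__":
    n = 27  # assumption: inferred from the 351 exhaustive pairs
    scores = {i: random.Random(i).random() for i in range(n)}
    ranking, pairs_used = sort_with_preferences(list(range(n)), scores)
    print("exhaustive pairs:", count_all_pairs(n))
    print("pairs queried by the sort:", pairs_used)
```

The sort only queries pairs whose outcome it needs in order to place an item, so the number of distinct pairs stays well below the exhaustive count. A method such as the one in the abstract would additionally vary the number of votes per pair, spending more on pairs that are hard to separate.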