The laborious and costly nature of affect annotation is a key obstacle to obtaining large-scale corpora with valid and reliable affect labels. Motivated by the lack of tools that can effectively determine an annotator's reliability, this paper proposes general quality assurance (QA) tests for real-time continuous annotation tasks. Assuming that the annotation tasks rely on stimuli with audiovisual components, such as videos, we propose and evaluate two QA tests: a visual and an auditory QA test. We validate the QA tool with 20 annotators who are asked to complete the tests and then carry out a lengthy task of annotating engagement in gameplay videos. Unsurprisingly, our findings suggest that trained annotators are more reliable than the best untrained crowdworkers we could employ. Importantly, the introduced QA tool can predict the reliability of an affect annotator with 80% accuracy, thereby saving resources, effort, and cost, and maximizing the reliability of the labels solicited for affective corpora. The QA tool is available through the PAGAN annotation platform.