Human annotation is time-consuming and labor-intensive. Interactive data annotation addresses this by using an annotation model to provide suggestions that humans approve or correct. However, annotation models trained on limited labeled data are prone to generating incorrect suggestions, which adds human correction effort. To tackle this challenge, we propose Araida, an analogical reasoning-based approach that improves automatic annotation accuracy in the interactive data annotation setting and reduces the need for human corrections. Araida uses an error-aware integration strategy that dynamically coordinates an annotation model and a k-nearest neighbors (KNN) model, giving more weight to KNN's predictions when the annotation model's predictions are deemed inaccurate. Empirical studies demonstrate that Araida adapts to different annotation tasks and models. On average, it reduces human correction labor by 11.02% compared to vanilla interactive data annotation methods.
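The error-aware integration idea can be illustrated with a small sketch. This is not Araida's actual implementation: the abstract does not give the combination rule, so the linear interpolation below, the Euclidean-distance KNN vote, and all names (`knn_distribution`, `integrate`, `suggest`, `error_prob`) are assumptions chosen for illustration. The estimated error probability is taken as a given input here; how it is obtained is left open.

```python
import math
from collections import Counter


def knn_distribution(query, labeled_points, labels, k=3):
    """Label distribution among the k labeled points nearest to `query`.

    Assumption: plain Euclidean distance over feature tuples.
    """
    ranked = sorted(
        zip(labeled_points, labels),
        key=lambda pair: math.dist(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    total = sum(votes.values())
    return {label: count / total for label, count in votes.items()}


def integrate(model_probs, knn_probs, error_prob):
    """Blend the two distributions (hypothetical rule, not from the paper).

    `error_prob` is the estimated chance that the annotation model is wrong
    on this example; the higher it is, the more weight KNN receives.
    """
    classes = set(model_probs) | set(knn_probs)
    return {
        c: (1.0 - error_prob) * model_probs.get(c, 0.0)
        + error_prob * knn_probs.get(c, 0.0)
        for c in classes
    }


def suggest(model_probs, knn_probs, error_prob):
    """Return the label with the highest blended probability."""
    combined = integrate(model_probs, knn_probs, error_prob)
    return max(combined, key=combined.get)


# Toy data: two labeled clusters and a query close to cluster "A".
labeled_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["A", "A", "B", "B"]
knn_probs = knn_distribution((0.2, 0.1), labeled_points, labels, k=3)

# The annotation model leans toward "B" for the same query.
model_probs = {"A": 0.4, "B": 0.6}

trusted = suggest(model_probs, knn_probs, error_prob=0.1)     # model dominates
distrusted = suggest(model_probs, knn_probs, error_prob=0.8)  # KNN dominates
```

With a low estimated error probability the suggestion follows the annotation model, and with a high one it follows the nearest labeled neighbors, which is the behavior the abstract describes at a high level.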