Human annotation is time-consuming and labor-intensive. To address this issue, interactive data annotation employs an annotation model to suggest labels for humans to approve or correct. However, annotation models trained on limited labeled data are prone to generating incorrect suggestions, incurring extra human correction effort. To tackle this challenge, we propose Araida, an analogical reasoning-based approach that improves automatic annotation accuracy in the interactive data annotation setting and reduces the need for human corrections. Araida involves an error-aware integration strategy that dynamically coordinates an annotation model and a k-nearest neighbors (KNN) model, giving more weight to the KNN's predictions when the annotation model's predictions are deemed inaccurate. Empirical studies demonstrate that Araida is adaptable to different annotation tasks and models. On average, it reduces human correction labor by 11.02% compared to vanilla interactive data annotation methods.
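The error-aware integration idea can be illustrated with a minimal sketch: blend the two models' class distributions, weighting the KNN more heavily when the annotation model's prediction is estimated to be wrong. The function names and the scalar `error_score` gate below are hypothetical illustrations, not the paper's actual error-aware module.

```python
import numpy as np

def combine_predictions(model_probs, knn_probs, error_score):
    """Blend annotation-model and KNN class distributions.

    error_score in [0, 1] is a hypothetical estimate of how likely the
    annotation model's prediction is wrong; higher values shift weight
    toward the KNN's prediction, as in an error-aware gating scheme.
    """
    weight_knn = error_score          # trust KNN more when the model looks wrong
    weight_model = 1.0 - error_score  # otherwise keep the model's suggestion
    combined = weight_model * model_probs + weight_knn * knn_probs
    return combined / combined.sum()  # renormalize to a valid distribution

# Toy binary example: the annotation model is uncertain and judged
# likely wrong (error_score=0.8), while the KNN strongly disagrees.
model_probs = np.array([0.55, 0.45])
knn_probs = np.array([0.10, 0.90])
blended = combine_predictions(model_probs, knn_probs, error_score=0.8)
suggestion = int(np.argmax(blended))  # KNN's preferred class wins here
```

In this toy case the blended distribution is [0.19, 0.81], so the suggestion shown to the human annotator follows the KNN rather than the unreliable annotation model.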