We introduce the Guideline-Centered Annotation Methodology (GCAM), a novel data annotation methodology designed to report the annotation guidelines associated with each data sample. Our approach addresses three key limitations of the standard prescriptive annotation methodology by reducing information loss during annotation and ensuring adherence to guidelines. Furthermore, GCAM enables the efficient reuse of annotated data across multiple tasks. We evaluate GCAM in two ways: (i) through a human annotation study and (ii) through an experimental evaluation with several machine learning models. Our results highlight the advantages of GCAM from multiple perspectives, demonstrating its potential to improve annotation quality and error analysis.