Many annotation tasks in natural language processing are highly subjective: different annotators can hold valid, justified perspectives on the proper label for a given example. This also applies to judging argument quality, where assigning a single ground-truth label is often questionable. At the same time, generally accepted concepts behind argumentation form a common ground. To best represent the interplay of individual and shared perspectives, we consider a continuum of approaches, ranging from models that fully aggregate perspectives into a majority label to "share-nothing" architectures in which each annotator is modeled in isolation from all others. Between these extremes, inspired by models from the field of recommender systems, we investigate to what extent architectures with layers that model the relations between annotators are beneficial for predicting single-annotator labels. On two argument quality classification tasks (argument concreteness and validity/novelty of conclusions), we show that recommender architectures increase the averaged annotator-individual F$_1$ scores by up to $43\%$ over a majority-label model. Our findings indicate that approaches to subjectivity can benefit from relating individual perspectives.
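To make the evaluation setting concrete, the following is a minimal sketch of how a majority-label baseline could be scored against each annotator's own labels and then averaged. It is an illustration under stated assumptions, not the authors' implementation: the toy labels, the tie-breaking rule, the use of binary F$_1$ on the positive class, and all function names are hypothetical.

```python
from collections import Counter

def majority_label(labels):
    """Most frequent label; ties go to the smallest label (an assumption)."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)

def f1_binary(gold, pred, positive=1):
    """Binary F1 for the positive class; returns 0.0 if there are no true positives."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def averaged_annotator_f1(annotations, predictions):
    """Mean over annotators of F1 between each annotator's labels and the predictions made for them."""
    scores = [f1_binary(annotations[a], predictions[a]) for a in annotations]
    return sum(scores) / len(scores)

# Hypothetical toy data: three annotators labeling the same four examples.
annotations = {
    "a1": [1, 1, 0, 0],
    "a2": [1, 0, 0, 1],
    "a3": [1, 1, 0, 1],
}

# Aggregate the per-example labels into one majority label each.
majority = [majority_label(votes) for votes in zip(*annotations.values())]

# An idealized majority-label model predicts the same label for every annotator.
predictions = {a: majority for a in annotations}
avg = averaged_annotator_f1(annotations, predictions)
print(majority, round(avg, 3))
```

Even a majority-label model that reproduces the aggregated label perfectly cannot match annotators who disagree with the majority, which is the gap that annotator-specific predictions aim to close.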