Detecting problematic content, such as hate speech, is a multifaceted and ever-changing task, influenced by social dynamics, user populations, diversity of sources, and evolving language. There have been significant efforts, both in academia and in industry, to develop annotated resources that capture various aspects of problematic content. Because researchers' objectives differ, the annotations are inconsistent, and reports of progress on detecting problematic content are therefore fragmented. This pattern is expected to persist unless we consolidate resources while accounting for the dynamic nature of the problem. We propose integrating the available resources and leveraging their dynamic nature to break this pattern. In this paper, we introduce a continual learning benchmark and framework for problematic content detection comprising over 84 related tasks encompassing 15 annotation schemas from 8 sources. Our benchmark establishes a novel measure of progress: it prioritizes the adaptability of classifiers to evolving tasks over excelling at specific tasks. To ensure the continued relevance of our framework, we designed it so that new tasks can be easily integrated into the benchmark. Our baseline results demonstrate the potential of continual learning for capturing evolving content and adapting to novel manifestations of problematic content.