Algorithmic hate speech detection faces significant challenges because research and practice rely on diverse definitions and datasets. Social media platforms, legal frameworks, and institutions each apply distinct yet overlapping definitions, which complicates classification. This study addresses these challenges by demonstrating that existing datasets and taxonomies can be integrated into a unified model, improving prediction performance and reducing reliance on multiple specialized classifiers. We introduce a universal taxonomy and a hate speech classifier capable of detecting hate speech under a wide range of definitions within a single framework. We validate the approach by combining two widely used but differently annotated datasets, and show improved classification performance on an independent test set. This work highlights the potential of dataset and taxonomy integration to advance hate speech detection, increase efficiency, and ensure broader applicability across contexts.
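The integration idea can be illustrated with a minimal sketch. It assumes that each dataset's native labels can be mapped onto a shared set of categories, so that differently annotated examples become multi-label targets for one classifier. The category names, dataset label names, and mappings below are hypothetical placeholders for illustration, not the taxonomy or datasets used in this work.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical unified taxonomy (illustrative category names only).
UNIFIED_LABELS: List[str] = ["hateful", "offensive", "targets_group"]

# Hypothetical mappings from each dataset's native labels to unified categories.
DATASET_A_MAP: Dict[str, FrozenSet[str]] = {
    "hate": frozenset({"hateful", "offensive", "targets_group"}),
    "offensive": frozenset({"offensive"}),
    "neither": frozenset(),
}
DATASET_B_MAP: Dict[str, FrozenSet[str]] = {
    "hateful": frozenset({"hateful", "targets_group"}),
    "abusive": frozenset({"offensive"}),
    "normal": frozenset(),
}

Example = Tuple[str, str]           # (text, native label)
UnifiedExample = Tuple[str, List[int]]  # (text, multi-hot unified target)


def to_unified(label: str, mapping: Dict[str, FrozenSet[str]]) -> List[int]:
    """Convert one native label into a multi-hot vector over UNIFIED_LABELS."""
    categories = mapping[label]
    return [1 if name in categories else 0 for name in UNIFIED_LABELS]


def merge_datasets(
    datasets: Sequence[Tuple[Sequence[Example], Dict[str, FrozenSet[str]]]]
) -> List[UnifiedExample]:
    """Relabel every example with the unified taxonomy and pool the results."""
    merged: List[UnifiedExample] = []
    for examples, mapping in datasets:
        for text, label in examples:
            merged.append((text, to_unified(label, mapping)))
    return merged


# Toy inputs standing in for two differently annotated datasets.
dataset_a = [("example text 1", "hate"), ("example text 2", "neither")]
dataset_b = [("example text 3", "abusive")]

merged = merge_datasets([(dataset_a, DATASET_A_MAP), (dataset_b, DATASET_B_MAP)])
```

The pooled examples could then be used to train a single multi-label classifier in place of one specialized classifier per dataset. How labels are actually aligned depends on the taxonomy, which this sketch does not attempt to reproduce.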