In today's digital world, social media plays a significant role in facilitating communication and content sharing. However, the exponential rise in user-generated content has led to challenges in maintaining a respectful online environment. In some cases, users have taken advantage of anonymity in order to use harmful language, which can negatively affect the user experience and pose serious social problems. Recognizing the limitations of manual moderation, automatic detection systems have been developed to tackle this problem. Nevertheless, several obstacles persist, including the absence of a universal definition for harmful language, inadequate datasets across languages, the need for detailed annotation guideline, and most importantly, a comprehensive framework. This study aims to address these challenges by introducing, for the first time, a detailed framework adaptable to any language. This framework encompasses various aspects of harmful language detection. A key component of the framework is the development of a general and detailed annotation guideline. Additionally, the integration of sentiment analysis represents a novel approach to enhancing harmful language detection. Also, a definition of harmful language based on the review of different related concepts is presented. To demonstrate the effectiveness of the proposed framework, its implementation in a challenging low-resource language is conducted. We collected a Persian dataset and applied the annotation guideline for harmful detection and sentiment analysis. Next, we present baseline experiments utilizing machine and deep learning methods to set benchmarks. Results prove the framework's high performance, achieving an accuracy of 99.4% in offensive language detection and 66.2% in sentiment analysis.
翻译:在当今数字世界中,社交媒体在促进沟通和内容共享方面发挥着重要作用。然而,用户生成内容的指数级增长给维护文明网络环境带来了挑战。部分用户利用匿名性使用有害语言,这既影响用户体验,又引发严重社会问题。鉴于人工审核的局限性,研究者开发了自动检测系统。但当前仍存在若干障碍:缺乏有害语言的统一定义、跨语言数据集不足、需要详细的标注指南,以及最重要的——缺乏综合框架。本研究首次提出一种可适应任何语言的详细框架,涵盖有害语言检测的多个维度。该框架的核心是制定通用且详细的标注指南,并将情感分析作为增强有害语言检测的创新方法。同时,基于对相关概念的梳理给出了有害语言的定义。为验证所提框架的有效性,我们在低资源语言上进行了实施:收集波斯语数据集并应用标注指南进行有害检测和情感分析,通过机器学习和深度学习方法设置基准实验。结果表明该框架性能优异,在攻击性语言检测中准确率达99.4%,情感分析准确率达66.2%。