HateGPT: Unleashing GPT-3.5 Turbo to Combat Hate Speech on X

Source: arXiv. Accepted at FIRE 2024 (Track: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC)). arXiv admin note: text overlap with arXiv:2411.05039, arXiv:2411.06946

The widespread use of social media platforms such as Twitter and Facebook lets people of all ages share their thoughts and experiences, producing an immense volume of user-generated content. Alongside these benefits, the platforms must manage hate speech and offensive content, which can undermine rational discourse and threaten democratic values. There is therefore a growing need for automated methods to detect and mitigate such content, especially since conversations may require contextual analysis across multiple languages, including code-mixed languages such as Hinglish, German-English, and Bangla. We participated in the English task, which requires classifying English tweets into two categories: Hate and Offensive, and Non Hate-Offensive. We prompt a state-of-the-art large language model, GPT-3.5 Turbo, to perform this classification and evaluate it with the Macro-F1 score over three distinct runs. Macro-F1, which balances precision and recall across all classes, is the primary evaluation metric. The scores are 0.756 for run 1, 0.751 for run 2, and 0.754 for run 3, a spread of only 0.005 between the best and worst runs. Run 1 performs best, and the small variation across runs indicates that the model's precision and recall are stable, highlighting its robustness and reliability.
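To make the evaluation metric concrete, the following is a minimal sketch of how Macro-F1 can be computed for a two-class task, and how the spread of the three reported scores can be summarized. It is not the authors' evaluation code: the label names `HOF` and `NOT` and the toy predictions are assumptions for illustration only. Only the three run scores come from the abstract.

```python
from statistics import mean, pstdev


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return mean(f1_scores)


# Hypothetical label names and toy data, used only to exercise the function.
LABELS = ["HOF", "NOT"]
toy_true = ["HOF", "HOF", "NOT", "NOT", "NOT", "NOT"]
toy_pred = ["HOF", "NOT", "NOT", "NOT", "NOT", "HOF"]
toy_score = macro_f1(toy_true, toy_pred, LABELS)

# Macro-F1 scores reported in the abstract for runs 1, 2, and 3.
run_scores = [0.756, 0.751, 0.754]
run_mean = mean(run_scores)
run_spread = max(run_scores) - min(run_scores)
run_std = pstdev(run_scores)

print(f"toy macro-F1: {toy_score:.3f}")
print(f"mean over runs: {run_mean:.4f}, spread: {run_spread:.3f}, std: {run_std:.4f}")
```

Because each class contributes equally to the average regardless of its size, Macro-F1 penalizes a classifier that does well only on the majority class, which matters when hateful tweets are the minority.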

